Article

Estimation for Partial Functional Multiplicative Regression Model

School of Mathematics and Computer Science, Shanxi Normal University, Taiyuan 031031, China
*
Author to whom correspondence should be addressed.
Mathematics 2025, 13(3), 471; https://doi.org/10.3390/math13030471
Submission received: 10 December 2024 / Revised: 29 December 2024 / Accepted: 16 January 2025 / Published: 31 January 2025
(This article belongs to the Special Issue New Advances in High-Dimensional and Non-asymptotic Statistics)

Abstract

Functional data such as curves, shapes, and manifolds have become increasingly common with modern technological advancements. The multiplicative regression model is well suited to analyzing data with positive responses. In this study, we investigate the estimation problems of the partial functional multiplicative regression model (PFMRM) based on the least absolute relative error (LARE) criterion and the least product relative error (LPRE) criterion. The functional predictor and slope function are approximated by functional principal component basis functions. Under certain regularity conditions, we derive the convergence rate of the slope function and establish the asymptotic normality of the slope vector for the two estimation methods. Monte Carlo simulations are carried out to evaluate the proposed methods, and an application to the Tecator data is investigated for illustration.

1. Introduction

We consider the PFMRM, which pairs a scalar response with some scalar covariates and a functional predictor. That is,
$$Y = \exp\Big\{ Z^{\top}\alpha + \int_{T} \beta(t) X(t)\,dt \Big\}\,\epsilon, \qquad \text{(1)}$$
where $Y$ is a positive response variable; $Z = (Z_1, \ldots, Z_p)^{\top}$ is a $p$-dimensional covariate vector; $\alpha = (\alpha_1, \ldots, \alpha_p)^{\top}$ is the $p$-vector of slope coefficients, in which $p$ is assumed to be fixed; $\beta(t) \in L^2(T)$ is the unknown slope function associated with the functional predictor $X(t)$; and $\epsilon$ is a positive random error independent of $Z$ and $X(t)$. Here, the Hilbert space $L^2(T)$ is the set of all square-integrable functions on $T$, endowed with the inner product $\langle x, y \rangle = \int_T x(t) y(t)\,dt$ and the norm $\|x\| = \langle x, x \rangle^{1/2}$. Model (1) generalizes both the classic multiplicative regression model [1] and the functional multiplicative regression model [2], which correspond to the cases $\beta(t) = 0$ and $\alpha = 0$, respectively. After a log transformation, model (1) reduces to a partial functional linear regression model [3]. When the response variable $Y$ is a failure time, model (1) is called the functional accelerated failure time model in survival analysis; see [4], for example. For notational simplicity, we assume throughout that $T = [0, 1]$ and that $Z$ and $X(t)$ have zero mean.
In many applications, the response variable is positive; for example, survival times, stock prices, incomes, body fat levels, emissions of nitrogen oxides, and the values of owner-occupied homes frequently arise in statistical practice. The multiplicative regression model plays an important role in describing these types of data. To estimate multiplicative regression models, Refs. [1,5] proposed the LARE and LPRE estimation, respectively. The LARE criterion minimizes $\sum_{i=1}^{n} \{ |\epsilon_i^{-1} - 1| + |\epsilon_i - 1| \}$, and the LPRE criterion minimizes $\sum_{i=1}^{n} |\epsilon_i^{-1} - 1| \times |\epsilon_i - 1|$, which is equivalent to minimizing $\sum_{i=1}^{n} (\epsilon_i^{-1} + \epsilon_i)$. As pointed out by [5], the LARE estimation is robust and scale-free, but its optimization may be challenging because the objective function being minimized is non-smooth. In addition, confidence intervals for the parameters are not very accurate due to the complexity of the asymptotic covariance matrix, which involves the density of the model error. To overcome these shortcomings of the LARE, Ref. [5] proposed the LPRE criterion, which is strictly convex and infinitely differentiable, so its optimization is much easier. In recent years, owing to the excellent properties of the LARE and LPRE estimations, scholars in various fields have been attracted to extending them; readers may refer to [6,7,8].
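To make the two criteria concrete, the following minimal sketch (our illustration, not code from any of the cited works) evaluates both objective functions for a plain multiplicative model $Y_i = \exp(x_i^{\top}\theta)\,\epsilon_i$; the array names `x`, `y`, and `theta` are hypothetical:

```python
import numpy as np

def lare_loss(theta, x, y):
    # LARE: |(y - yhat)/y| + |(y - yhat)/yhat| summed over observations;
    # non-smooth at yhat = y, so gradient-based optimizers need care.
    yhat = np.exp(x @ theta)
    return np.sum(np.abs((y - yhat) / y) + np.abs((y - yhat) / yhat))

def lpre_loss(theta, x, y):
    # LPRE: equivalent (up to a constant) to summing eps + 1/eps with
    # eps = y / yhat; smooth and strictly convex in theta.
    eta = x @ theta
    return np.sum(y * np.exp(-eta) + np.exp(eta) / y - 2.0)
```

The contrast in smoothness is exactly the point made above: `lare_loss` contains absolute values and kinks, while `lpre_loss` is infinitely differentiable.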
For functional multiplicative models, to the best of our knowledge, only a few works exist, and all of them focus on the above two criteria. For example, Ref. [9] developed the functional quadratic multiplicative model and derived the asymptotic properties of the estimator under the LARE criterion. Later, Refs. [2,10] considered variable selection for partially and locally sparse functional linear multiplicative models based on the LARE criterion. In this paper, we consider the modeling of a positive scalar response with both scalar and functional predictors under the PFMRM. The above two criteria are employed to estimate the parameter vector $\alpha$ and the slope function $\beta(t)$ in model (1).
The major contributions of this paper are four-fold. First, this study extends the LPRE criterion to the estimation of functional regression models for the first time. Second, we approximate the unknown slope function and functional predictor using the functional principal component analysis technique, derive the convergence rates of the slope function estimators, and establish the asymptotic normality of the parameter vector under mild regularity conditions for the two estimation methods. Third, we develop an iterative algorithm to solve the involved optimization problem and propose a data-driven procedure to select the tuning parameter. Finally, we conduct extensive numerical studies to examine the finite-sample performance of the proposed methods and find that the LPRE method outperforms the LARE, least squares, and least absolute deviation methods.
The rest of the article is organized as follows. Section 2 describes the detailed estimation procedures for model (1). Section 3 is dedicated to the asymptotic study of our estimators. A feasible algorithm for estimating the parameters and the nonparametric function of the PFMRM, based on the LPRE criterion, is presented in Section 4. Section 5 conducts simulation studies to evaluate the finite-sample performance of the proposed methods. In Section 6, we apply the proposed methods to the Tecator data. The article concludes with a discussion in Section 7. Proofs are provided in Appendix A.

2. Estimation Method

Let $\{Z_i, X_i(\cdot), Y_i, \epsilon_i\}$, $i \in \{1, \ldots, n\}$, be independent realizations of $\{Z, X(\cdot), Y, \epsilon\}$ generated from model (1); that is,
$$Y_i = \exp\Big\{ Z_i^{\top}\alpha + \int_0^1 \beta(t) X_i(t)\,dt \Big\}\,\epsilon_i, \quad i \in \{1, \ldots, n\}, \qquad \text{(2)}$$
where the random errors $\epsilon_i$, $i \in \{1, \ldots, n\}$, are independent and identically distributed (i.i.d.) and independent of $Z_i$ and $X_i(\cdot)$.
The covariance function of $X(\cdot)$ and its empirical counterpart are defined as
$$C_X(t, s) = \operatorname{Cov}(X(t), X(s)), \qquad \widehat{C}_X(t, s) = \frac{1}{n}\sum_{i=1}^{n} X_i(t) X_i(s).$$
According to Mercer's theorem, the spectral expansions of $C_X$ and $\widehat{C}_X$ can be written as
$$C_X(t, s) = \sum_{j=1}^{\infty} \lambda_j v_j(t) v_j(s), \qquad \widehat{C}_X(t, s) = \sum_{j=1}^{\infty} \hat{\lambda}_j \hat{v}_j(t) \hat{v}_j(s),$$
where $\lambda_1 > \lambda_2 > \cdots > 0$ and $\hat{\lambda}_1 \ge \hat{\lambda}_2 \ge \cdots \ge \hat{\lambda}_{n+1} = \cdots = 0$ are the ordered eigenvalue sequences of the linear operators with kernels $C_X$ and $\widehat{C}_X$, respectively, and $\{v_j(\cdot)\}_{j=1}^{\infty}$ and $\{\hat{v}_j(\cdot)\}_{j=1}^{\infty}$ are the corresponding orthonormal eigenfunction sequences. With a slight abuse of notation, we use $C_X$ to denote both the covariance operator and the covariance function of $X(\cdot)$. We assume that the covariance operator $C_X$, defined by $(C_X f)(s) = \int_0^1 C_X(t, s) f(t)\,dt$, is strictly positive. In addition, $(\hat{v}_j(\cdot), \hat{\lambda}_j)$ can be regarded as an estimator of $(v_j(\cdot), \lambda_j)$.
On the basis of the Karhunen–Loève decomposition, $\beta(t)$ and $X_i(t)$ can be expanded as
$$\beta(t) = \sum_{j=1}^{\infty} \gamma_j v_j(t), \qquad X_i(t) = \sum_{j=1}^{\infty} \xi_{ij} v_j(t), \quad i \in \{1, \ldots, n\}, \qquad \text{(3)}$$
where $\gamma_j = \langle \beta(\cdot), v_j(\cdot) \rangle = \int_0^1 \beta(t) v_j(t)\,dt$, and $\xi_{ij} = \langle X_i(\cdot), v_j(\cdot) \rangle$ represents the coordinate (score) of the $i$th curve with respect to the $j$th eigenfunction.
Analogously, we define $C_{YX}(\cdot) = \operatorname{Cov}(Y, X(\cdot))$, $C_Z = \operatorname{Var}(Z)$, $C_{ZY} = \operatorname{Cov}(Z, Y)$, and $C_{ZX}(\cdot) = \operatorname{Cov}(Z, X(\cdot)) = (C_{Z_1 X}(\cdot), \ldots, C_{Z_p X}(\cdot))^{\top}$. Their empirical counterparts are defined as
$$\widehat{C}_{YX} = \frac{1}{n}\sum_{i=1}^{n} Y_i X_i, \quad \widehat{C}_Z = \frac{1}{n}\sum_{i=1}^{n} Z_i Z_i^{\top}, \quad \widehat{C}_{ZY} = \frac{1}{n}\sum_{i=1}^{n} Z_i Y_i, \quad \widehat{C}_{ZX} = \frac{1}{n}\sum_{i=1}^{n} Z_i X_i.$$
Given the orthogonality of $\{v_1(\cdot), \ldots, v_m(\cdot)\}$ and (3), model (2) can be rewritten as
$$Y_i = \exp\Big\{ Z_i^{\top}\alpha + \sum_{j=1}^{m} \gamma_j \xi_{ij} \Big\}\,\tilde{\epsilon}_i = \exp(Z_i^{\top}\alpha + U_i^{\top}\gamma)\,\tilde{\epsilon}_i, \quad i = 1, \ldots, n,$$
where $U_i = (\xi_{i1}, \ldots, \xi_{im})^{\top}$, $\gamma = (\gamma_1, \ldots, \gamma_m)^{\top}$, $\tilde{\epsilon}_i = \exp\{\sum_{j=m+1}^{\infty} \gamma_j \xi_{ij}\}\,\epsilon_i$, and the truncation parameter $m \to \infty$ as $n \to \infty$.
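Before turning to the two criteria, the following sketch illustrates the truncation step, assuming curves observed on a common equally spaced grid so that the eigen-decomposition of the empirical covariance approximates the Mercer expansion above; the function and array names are ours, not the authors':

```python
import numpy as np

def fpca_scores(X, t, m):
    """X: (n, T) matrix of centered curves on grid t; returns the first m
    empirical eigenvalues, eigenfunctions, and scores xi_ij = <X_i, v_j>."""
    n, T = X.shape
    h = t[1] - t[0]                       # grid spacing (integration weight)
    C = X.T @ X / n                       # empirical covariance C_hat(t_j, t_k)
    evals, evecs = np.linalg.eigh(C * h)  # discretized covariance operator
    idx = np.argsort(evals)[::-1][:m]     # keep the m largest eigenvalues
    lam = evals[idx]
    v = evecs[:, idx] / np.sqrt(h)        # rescale so that int v_j(t)^2 dt = 1
    scores = (X @ v) * h                  # Riemann approximation of <X_i, v_j>
    return lam, v, scores
```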

2.1. LARE Estimation

The LARE estimates are based on the criterion of [1], with $v_j(\cdot)$ replaced by its estimator $\hat{v}_j(\cdot)$. They are obtained by minimizing the following loss function:
$$(\hat{\alpha}^{\top}, \hat{\gamma}^{\top})^{\top} = \arg\min_{(\alpha, \gamma)} \sum_{i=1}^{n} \left\{ \left| \frac{Y_i - \exp(Z_i^{\top}\alpha + \widehat{U}_i^{\top}\gamma)}{Y_i} \right| + \left| \frac{Y_i - \exp(Z_i^{\top}\alpha + \widehat{U}_i^{\top}\gamma)}{\exp(Z_i^{\top}\alpha + \widehat{U}_i^{\top}\gamma)} \right| \right\}, \qquad \text{(4)}$$
where $\widehat{U}_i = (\hat{\xi}_{i1}, \ldots, \hat{\xi}_{im})^{\top}$ with $\hat{\xi}_{ij} = \langle X_i(\cdot), \hat{v}_j(\cdot) \rangle$, and $\hat{\gamma} = (\hat{\gamma}_1, \ldots, \hat{\gamma}_m)^{\top}$. The LARE estimator of the slope function is then $\hat{\beta}(t) = \sum_{j=1}^{m} \hat{\gamma}_j \hat{v}_j(t)$.

2.2. LPRE Estimation

The LPRE estimates are based on the criterion of [5], again with $v_j(\cdot)$ replaced by its estimator $\hat{v}_j(\cdot)$. They are obtained by minimizing the following loss function:
$$(\tilde{\alpha}^{\top}, \tilde{\gamma}^{\top})^{\top} = \arg\min_{(\alpha, \gamma)} \sum_{i=1}^{n} \Big\{ Y_i \exp(-Z_i^{\top}\alpha - \widehat{U}_i^{\top}\gamma) + Y_i^{-1} \exp(Z_i^{\top}\alpha + \widehat{U}_i^{\top}\gamma) - 2 \Big\}, \qquad \text{(5)}$$
where $\tilde{\gamma} = (\tilde{\gamma}_1, \ldots, \tilde{\gamma}_m)^{\top}$. The LPRE estimator of the slope function is then $\tilde{\beta}(t) = \sum_{j=1}^{m} \tilde{\gamma}_j \hat{v}_j(t)$.
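As a quick sanity check on (5) (a one-line derivation we add for intuition, not part of the original text), note that each summand $\ell(\eta) = Y e^{-\eta} + Y^{-1} e^{\eta} - 2$, with $\eta = Z^{\top}\alpha + \widehat{U}^{\top}\gamma$, is strictly convex in $\eta$ and is minimized exactly where the fitted scale matches the response:
$$\ell'(\eta) = -Y e^{-\eta} + Y^{-1} e^{\eta} = 0 \;\Longleftrightarrow\; e^{2\eta} = Y^2 \;\Longleftrightarrow\; \eta = \log Y, \qquad \ell''(\eta) = Y e^{-\eta} + Y^{-1} e^{\eta} > 0.$$
This smooth convexity is what makes the Newton–Raphson implementation in Section 4 straightforward.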

3. Asymptotic Properties

In this section, we establish the asymptotic properties of the estimators. Formulating the results requires the following technical assumptions; first, we present some notation. Suppose that $\alpha_0$ and $\beta_0(t)$ are the true values of $\alpha$ and $\beta(t)$, respectively, and let $\gamma_0 = (\gamma_{01}, \ldots, \gamma_{0m})^{\top}$ be the true score coefficient vector. The notation $\|\cdot\|$ denotes the $L^2$ norm for a function and the Euclidean norm for a vector. In what follows, $c$ denotes a generic positive constant that may take different values at different places. Moreover, $a_n \asymp b_n$ means that $a_n/b_n$ is bounded away from zero and infinity as $n \to \infty$.
C1.
The random process $X(\cdot)$ and the scores $\xi_i$ satisfy the following conditions:
$$E\|X(\cdot)\|^4 < \infty, \qquad E(\xi_i^4) \le c\,\lambda_i^2, \quad i \ge 1.$$
C2.
For the eigenvalues of the linear operator $C_X$ and the score coefficients, the following conditions hold:
(a) There exist some constants $c$ and $a > 1$ such that $c^{-1} i^{-a} \le \lambda_i \le c\, i^{-a}$ and $\lambda_i - \lambda_{i+1} \ge c^{-1} i^{-a-1}$, $i \ge 1$;
(b) There exist some constants $c$ and $b > a/2 + 1$ such that $|\gamma_j| \le c\, j^{-b}$, $j \ge 1$.
C3.
The tuning parameter $m \asymp n^{1/(a+2b)}$.
C4.
For the random vector $Z$, $E\|Z\|^4 < \infty$.
C5.
There exists some constant $c$ such that $|\langle C_{Z_l X}, v_j \rangle| \le c\, j^{-(a+b)}$, $l = 1, \ldots, p$, $j \ge 1$.
C6.
Let $\eta_{ik} = Z_{ik} - \langle g_k, X_i \rangle$ with $g_k = \sum_{j=1}^{\infty} \lambda_j^{-1}\langle C_{Z_k X}, v_j \rangle v_j$ for each $k$; then $\eta_{1k}, \ldots, \eta_{nk}$ are independent and identically distributed random variables. Assume that
$$E[\eta_{1k} \mid X_1, \ldots, X_n] = 0, \qquad E[\eta_{1k}^2 \mid X_1, \ldots, X_n] = \Sigma_{kk},$$
where $\Sigma_{kk}$ is the $k$th diagonal element of $\Sigma = E(\eta_i \eta_i^{\top})$ with $\eta_i = (\eta_{i1}, \ldots, \eta_{ip})^{\top}$, and $\Sigma$ is a positive definite matrix.
C7.
The error $\epsilon$ has a continuous density $f(\cdot)$ in a neighborhood of 1 and is independent of $(Z, X(\cdot))$.
C8.
$P(\epsilon > 0) = 1$, $E(\epsilon + \epsilon^{-1}) < \infty$, $E[(\epsilon + \epsilon^{-1})\operatorname{sgn}(\epsilon - 1)] = 0$, and $E[(\epsilon^{-1} + \epsilon)^2] < \infty$.
C9.
$E(\epsilon - \epsilon^{-1}) = 0$ and $E[(\epsilon^{-1} + \epsilon)^2] < \infty$.
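For intuition (an illustrative remark we add, not one of the original conditions), any error with $\log\epsilon$ symmetric about zero, such as distributions (i) and (ii) in Section 5, satisfies the centering requirements in both C8 and C9: writing $\epsilon = e^{u}$ with $u \overset{d}{=} -u$, the integrands below are odd functions of $u$, so their expectations vanish whenever the stated moments exist:
$$E\big[(\epsilon + \epsilon^{-1})\operatorname{sgn}(\epsilon - 1)\big] = E\big[(e^{u} + e^{-u})\operatorname{sgn}(u)\big] = 0, \qquad E(\epsilon - \epsilon^{-1}) = E(e^{u} - e^{-u}) = 0.$$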
Remark 1. 
C1–C3 are standard assumptions used in classical functional linear regression (see, e.g., [11,12]). More specifically, C1 is needed for the consistency of $\widehat{C}_X(t, s)$. C2(a) is required to identify the slope function $\beta(t)$ by preventing the spacings between eigenvalues from being too small, while C2(b) ensures that the slope function $\beta(t)$ is sufficiently smooth. C3 is required to obtain the convergence rate of the slope function $\beta(t)$. C4–C6 are used to handle the linear part of the vector-type covariate in the model and are similar to the conditions in [3,13]. C4 is slightly stronger than its counterpart in classical linear models and is primarily used to ensure the asymptotic behavior of $\widehat{C}_{ZX}$ and $\widehat{C}_Z$. C5 makes the effect of truncation on the estimation of $\beta(\cdot)$ small enough. Notably, $\eta_{ik}$ is the regression error of $Z_{ik}$ on $X_i(\cdot)$, and the conditions on the $\eta$'s in C6 essentially restrict $Z$ to being only linearly related to $X(\cdot)$. C6 is also used to establish the asymptotic normality of the parameter estimator, in a manner similar to that applied in [3,13] for modeling the dependence between the parametric and nonparametric components. C7 and C8 are standard assumptions on the random errors for the LARE estimator, as used in [1]. C9 is the standard assumption on the random errors for the LPRE estimator, as used in [5].
The following two theorems present the convergence rate of the slope function estimator β ^ ( · ) and establish the asymptotic normality of the parameter estimator α ^ , respectively, with the LARE method introduced in Section 2.1 above.
Theorem 1. 
If conditions C1–C8 hold, then
$$\|\hat{\beta}(\cdot) - \beta_0(\cdot)\|^2 = O_p\big(n^{-\frac{2b-1}{a+2b}}\big).$$
Theorem 2. 
If conditions C1–C8 hold, then, as $n \to \infty$, we have
$$\sqrt{n}\,(\hat{\alpha} - \alpha_0) \xrightarrow{D} N\Big( 0, \tfrac{1}{4}\{J + 2f(1)\}^{-2}\,\Sigma^{-1} A \Big),$$
where $\xrightarrow{D}$ represents convergence in distribution, $A = E[(\epsilon + \epsilon^{-1})^2]$, and $J = E[\epsilon\operatorname{sgn}(\epsilon - 1)]$.
The following two theorems give the rate of convergence of the slope function and the asymptotic normality of the parameter vector, respectively, for the LPRE method introduced in Section 2.2 above.
Theorem 3. 
Suppose conditions C1–C6 and C9 hold; then,
$$\|\tilde{\beta}(\cdot) - \beta_0(\cdot)\|^2 = O_p\big(n^{-\frac{2b-1}{a+2b}}\big).$$
Theorem 4. 
Suppose that conditions C1–C6 and C9 hold; then, as $n \to \infty$, we have
$$\sqrt{n}\,(\tilde{\alpha} - \alpha_0) \xrightarrow{D} N(0, \Sigma_1^{-1}\Sigma^{-1}\Sigma_2),$$
where $\Sigma_1 = \{E(\epsilon + \epsilon^{-1})\}^2$ and $\Sigma_2 = E[(\epsilon - \epsilon^{-1})^2]$.
Remark 2. 
The convergence rate of the slope function $\beta(t)$ obtained in Theorems 1 and 3 is the same as that of [12,13], which is optimal in the minimax sense. The variance in Theorems 2 and 4 involves the random error distribution — in Theorem 2, in particular, the error density at 1 — which is a standard feature of multiplicative regression models. One can consult Theorem 3.2 of [14] for more details.

4. Implementation

Considering that the minimization problem of the LARE method can be treated as a special case of the LPRE procedure, we only provide a detailed implementation of the LPRE approach. Specifically, we use the Newton–Raphson iterative algorithm to solve the LPRE problem in Equation (5). Let $\zeta_i = (Z_i^{\top}, \widehat{U}_i^{\top})^{\top}$ and $\theta = (\alpha^{\top}, \gamma^{\top})^{\top}$; then, the objective function is $Q(\theta) = \sum_{i=1}^{n} \{ Y_i\exp(-\zeta_i^{\top}\theta) + Y_i^{-1}\exp(\zeta_i^{\top}\theta) - 2 \}$. The computation can be implemented as follows:
Step 1 Initialization step. In this paper, the least squares estimator $\tilde{\theta}^{(0)}$ is chosen as the initial estimator.
Step 2 Update the estimator $\tilde{\theta}^{(k)}$ of $\theta$ by using the following iterative procedure:
$$\tilde{\theta}^{(k+1)} = \tilde{\theta}^{(k)} - \big\{ \nabla^2 Q(\tilde{\theta}^{(k)}) \big\}^{-1} \nabla Q(\tilde{\theta}^{(k)}),$$
where $\nabla Q(\tilde{\theta}^{(k)}) = \frac{\partial Q(\theta)}{\partial\theta}\big|_{\theta = \tilde{\theta}^{(k)}}$ and $\nabla^2 Q(\tilde{\theta}^{(k)}) = \frac{\partial^2 Q(\theta)}{\partial\theta\,\partial\theta^{\top}}\big|_{\theta = \tilde{\theta}^{(k)}}$ represent the gradient and Hessian matrix of $Q(\theta)$ at $\tilde{\theta}^{(k)}$, respectively.
Step 3 Step 2 is repeated until convergence; we declare convergence when the $L_2$ norm of the difference between two consecutive estimates is less than $10^{-6}$. Note that [8] proposed a profiled LPRE method for partial linear multiplicative models whose algorithm requires that $E(\ln\epsilon) = 0$ and $E(\epsilon - \epsilon^{-1}) = 0$ hold simultaneously. Since the LPRE objective function (5) is infinitely differentiable and strictly convex, the proposed Newton–Raphson method relaxes the restriction $E(\ln\epsilon) = 0$; moreover, the minimizer of the objective function (5) is simply the root of its first derivative. We denote the final LPRE estimator of $\theta$ by $\tilde{\theta}$, and the LPRE estimator of the slope function by $\tilde{\beta}(t) = \sum_{j=1}^{m}\tilde{\gamma}_j\hat{v}_j(t)$.
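A compact sketch of Steps 1–3 (a minimal illustration under the LPRE objective $Q(\theta)$ above, not the authors' implementation; `zeta` stacks the rows $\zeta_i^{\top} = (Z_i^{\top}, \widehat{U}_i^{\top})$ and `y` holds the responses):

```python
import numpy as np

def lpre_newton(zeta, y, tol=1e-6, max_iter=100):
    """Newton-Raphson for Q(theta) = sum_i [y_i e^{-eta_i} + y_i^{-1} e^{eta_i} - 2],
    eta_i = zeta_i' theta. Initialized at the LS fit of log(y) on zeta (Step 1)."""
    theta = np.linalg.lstsq(zeta, np.log(y), rcond=None)[0]
    for _ in range(max_iter):
        eta = zeta @ theta
        a = y * np.exp(-eta)                         # y_i e^{-eta_i}
        b = np.exp(eta) / y                          # y_i^{-1} e^{eta_i}
        grad = zeta.T @ (b - a)                      # gradient of Q
        hess = zeta.T @ (zeta * (a + b)[:, None])    # Hessian, positive definite
        step = np.linalg.solve(hess, grad)
        theta = theta - step
        if np.linalg.norm(step) < tol:               # Step 3 stopping rule
            break
    return theta
```

Because the Hessian $\sum_i \zeta_i\zeta_i^{\top}\{Y_i e^{-\eta_i} + Y_i^{-1}e^{\eta_i}\}$ is positive definite, each Newton step is well defined, and the iteration converges rapidly from the least squares start.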

5. Simulation Studies

In this section, the finite-sample properties of the proposed estimation methods are investigated through Monte Carlo simulation studies. We compare the performance of the two proposed methods with the least absolute deviations (LAD) method in [15] and the least squares (LS) method in [3], where both the LS and LAD estimates are based on a logarithmic transformation of both sides of model (6) below. The sample size n is set to 150, 300, and 600, and the datasets are generated from the following model:
$$Y = \exp\Big\{ Z_1\alpha_1 + Z_2\alpha_2 + \int_0^1 X(t)\beta(t)\,dt \Big\}\,\epsilon, \qquad \text{(6)}$$
where $Z_1$ follows the standard normal distribution, $Z_2$ follows the Bernoulli distribution with success probability 0.5, and $\alpha = (\alpha_1, \alpha_2)^{\top} = (2, 1)^{\top}$. For the functional linear component, we follow a setting similar to that of [13] and set $\beta(t) = \sqrt{2}\sin(\pi t/2) + 3\sqrt{2}\sin(3\pi t/2)$ and $X(t) = \sum_{j=1}^{50} \xi_j v_j(t)$, where $v_j(t) = \sqrt{2}\sin((j - 0.5)\pi t)$ and the $\xi_j$ are independently normally distributed with mean 0 and variance $\lambda_j = 10\{(j - 0.5)\pi\}^{-2}$, $j \ge 1$. Similar to [1], the random error $\epsilon$ is drawn from one of the following three distributions: (i) $\log(\epsilon) \sim 0.1\,N(0, 1)$, (ii) $\log(\epsilon) \sim U(-2, 2)$, and (iii) $\epsilon \sim U(0.5, a)$, where $a$ is chosen so that $E(\epsilon) = E(\epsilon^{-1})$.
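For concreteness, the following sketch generates one dataset from model (6) under error distribution (i) (our reading of the stated design, with the $\sqrt{2}$ normalizations as written above; function and variable names are ours):

```python
import numpy as np

def simulate(n, n_grid=200, seed=0):
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_grid)
    j = np.arange(1, 51)
    lam = 10.0 / ((j - 0.5) * np.pi) ** 2                        # Var(xi_j)
    v = np.sqrt(2.0) * np.sin(np.outer(t, (j - 0.5) * np.pi))    # basis v_j(t)
    xi = rng.normal(0.0, np.sqrt(lam), size=(n, 50))             # FPC scores
    X = xi @ v.T                                                 # X_i(t) on the grid
    beta = np.sqrt(2) * np.sin(np.pi * t / 2) + 3 * np.sqrt(2) * np.sin(3 * np.pi * t / 2)
    Z1, Z2 = rng.standard_normal(n), rng.binomial(1, 0.5, n)
    eps = np.exp(0.1 * rng.standard_normal(n))                   # error (i)
    integral = X @ beta / n_grid                                 # Riemann sum of int X(t)beta(t)dt
    Y = np.exp(2.0 * Z1 + 1.0 * Z2 + integral) * eps
    return t, X, Z1, Z2, Y
```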
Implementing the proposed estimation methods requires the tuning parameter m. Here, m is selected as the minimum value at which the cumulative percentage of total variance (CPV) explained by the first m leading components reaches a given proportion $\delta$:
$$m = \min\Big\{ K : \sum_{k=1}^{K}\hat{\lambda}_k \Big/ \sum_{k=1}^{M}\hat{\lambda}_k \ge \delta \Big\},$$
where $M$ is the largest number of functional principal components such that $\hat{\lambda}_k > 0$, and $\delta = 90\%$ is used in this study.
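In code, the CPV rule is a one-pass scan over the ordered eigenvalues (a small helper we add for clarity, using the notation of Section 2):

```python
import numpy as np

def select_m(eigvals, delta=0.90):
    # Smallest m whose leading eigenvalues explain a fraction delta of the
    # total variance carried by the positive eigenvalues (the CPV rule).
    lam = eigvals[eigvals > 0]
    cpv = np.cumsum(lam) / np.sum(lam)
    return int(np.searchsorted(cpv, delta) + 1)
```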
Based on 500 replications, Table 1 summarizes the performance of the different estimators in terms of the bias (Bias) and standard deviation (Sd) of the estimates of $\alpha_1$ and $\alpha_2$, as well as the mean squared error (MSE) of the estimate of $\alpha$. Table 2 provides the root average square errors (RASEs) of the estimated $\beta(t)$ for the LARE estimation, where the RASE is defined as follows:
$$\mathrm{RASE} = \sqrt{ \frac{1}{d}\sum_{k=1}^{d} \big\{ \hat{\beta}(t_k) - \beta(t_k) \big\}^2 },$$
where $t_k$, $k = 1, \ldots, d$, are equally spaced grid points at which $\beta(t)$ is evaluated, and we take $d = 200$ in this simulation. We compute the RASE for each replication and report the average. The RASEs for the LPRE, LAD, and LS methods are defined analogously, with $\hat{\beta}(t)$ replaced by the corresponding estimators.
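Equivalently, in code (a one-liner we add under the same grid convention; the two input arrays hold the estimated and true slope functions evaluated at the $d$ grid points):

```python
import numpy as np

def rase(beta_hat_vals, beta_true_vals):
    # Root average square error over d equally spaced grid points.
    return np.sqrt(np.mean((beta_hat_vals - beta_true_vals) ** 2))
```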
From Table 1 and Table 2, we have the following observations: (a) Sd, MSE, and RASE decrease and the estimation performance improves as the sample size n increases from 150 to 600. The estimates of the parametric covariate effects are essentially unbiased and close to their true values, indicating that our proposed approaches produce consistent estimators. (b) When $\log(\epsilon)$ follows the normal distribution, as expected, both LS and LPRE perform the best, and LAD performs the worst. (c) When $\log(\epsilon)$ follows $U(-2, 2)$, LPRE performs the best, LARE also performs well, and LAD still performs the worst. (d) When $\epsilon$ follows $U(0.5, a)$, the random error violates condition C8 ($E[(\epsilon + \epsilon^{-1})\operatorname{sgn}(\epsilon - 1)] = 0$) for the LARE method, and neither the zero-mean error condition required by least squares nor the zero-median condition required by LAD regression holds. Meanwhile, the LPRE method works well because $E(\epsilon - \epsilon^{-1}) = 0$ still holds, and LPRE performs considerably better than LARE and LAD; this indicates that LPRE is much more robust than the LARE and LAD methods. In summary, LPRE performs the best in almost all the scenarios considered, confirming its superiority to LARE and the other competing methods.

6. Application to Tecator Data

In this section, we apply the proposed estimation methods to the Tecator data. The dataset is contained in the R package fda.usc [16] and includes 215 independent meat samples whose fat, protein, and water contents are measured in percent. It has been widely used in the analysis of functional data. The Tecator data consist of a 100-channel spectrum of absorbances over the wavelength range of 850 to 1050 nanometers (nm). Further details on the data can be found in [2,9]. The purpose is to tease out the relationship among the fat content Y (response), the protein content $Z_1$ and water content $Z_2$ (real random variables), and the spectrometric curve $X(t)$ (a functional predictor). To predict the fat content of a meat sample, we consider the following PFMRM:
$$Y = \exp\Big\{ Z_1\alpha_1 + Z_2\alpha_2 + \int_{850}^{1050} X(t)\beta(t)\,dt \Big\}\,\epsilon. \qquad \text{(7)}$$
To assess the predictive capability of the proposed methods, we followed [13] in randomly dividing the sample into two subsamples: a training sample $I_1 = \{(X_i, Y_i, Z_i, U_i)\}$ with $|I_1| = 165$, where $|I_1|$ denotes the cardinality of $I_1$, and a testing sample $I_2 = \{(X_i, Y_i, Z_i, U_i)\}$ with $|I_2| = 50$. The training and testing samples were used to estimate the parameters and to check the accuracy of the prediction, respectively. We used the mean quadratic error of prediction (MQEP) as the criterion to evaluate the performance of the various estimation procedures. The MQEP is defined as follows:
$$\mathrm{MQEP} = \frac{1}{N}\sum_{j \in I_2} (Y_j - \widehat{Y}_j)^2 \big/ \operatorname{Var}_{I_2}(Y),$$
where $\widehat{Y}_j$ is predicted based on the training sample, and $\operatorname{Var}_{I_2}(Y)$ denotes the sample variance of the response over the testing sample.
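The evaluation loop can be sketched as follows (our illustration of the splitting scheme; it assumes $N = |I_2|$ in the MQEP formula and takes the fitting and prediction routines as arguments, e.g., the LPRE fit of Section 4):

```python
import numpy as np

def mqep(y_test, y_pred):
    # Mean quadratic error of prediction, normalized by the test-sample
    # variance of the response (taking N = |I_2|).
    return np.mean((y_test - y_pred) ** 2) / np.var(y_test)

def split_once(fit, predict, X, Z, y, n_train=165, seed=1):
    # One random partition into a training sample (165) and a testing sample (50).
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    tr, te = idx[:n_train], idx[n_train:]
    model = fit(X[tr], Z[tr], y[tr])
    return mqep(y[te], predict(model, X[te], Z[te]))
```

Averaging `split_once` over N random seeds reproduces the repeated-partition scheme summarized in Table 3.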
In addition, we compare the performance of the proposed model with that of the partial functional linear regression model (PFLM) of Shin [3] and with a log transformation of both sides of model (7) (denoted "logPFLM"). Specifically,
$$\begin{aligned} \text{PFLM}: \quad & Y = Z_1\alpha_1 + Z_2\alpha_2 + \int_{850}^{1050} X(t)\beta(t)\,dt + e, \\ \text{logPFLM}: \quad & \ln Y = Z_1\alpha_1 + Z_2\alpha_2 + \int_{850}^{1050} X(t)\beta(t)\,dt + \varepsilon. \end{aligned}$$
The CPV criterion introduced in Section 5 was used to determine the cutoff parameter m. Here, m = 3 was selected, explaining approximately 95% of the variance in the Tecator data. Table 3 shows the MQEP averaged over N repeated random partitions. The first and second rows of Table 3 show the prediction results of the logPFLM using the LS and LAD methods, respectively. The third and fourth rows give the prediction results of (7) under the LARE and LPRE methods, respectively. The final row presents the prediction results of the PFLM without logarithmic transformation using the LS method. Overall, the LPRE outperforms all other competing methods regardless of the number of random splits, LS performs second best, and LAD performs the worst. We also applied the above models and methods to the Tecator data using only the scalar predictors or only the functional predictor; the results were relatively poor, so we do not report them.
We then used the best-performing LPRE method to estimate the unknown parameters based on the entire dataset. The estimates of $\alpha_1$ and $\alpha_2$ are 0.102 and 0.010, respectively; both protein and water content are positively associated with the logarithmic transformation of the fat content. Figure 1 depicts the estimated $\beta(t)$. In general, the spectrometric curve has a positive effect on the logarithmic transformation of the fat content, and the estimated curve $\beta(t)$ is small in magnitude due to the large integration domain. The advantages of the LPRE method are particularly evident in the analysis of this dataset.

7. Conclusions

In this paper, we studied the estimation problem for the PFMRM based on the LARE and LPRE criteria, with the unknown slope function and functional predictor approximated by the functional principal component analysis technique. Under some regularity conditions, we obtained the convergence rates of the slope function estimators and the asymptotic normality of the parameter vector for the two estimation methods. Both the numerical simulation results and the real data analysis show that the LPRE method is superior to the LARE, least squares, and least absolute deviation methods. Several issues still warrant further study. First, we chose the Karhunen–Loève expansion to approximate the slope function in this article; other nonparametric smoothing techniques, such as B-splines, kernel estimation, and penalized regression splines, can be used in our proposed LARE and LPRE estimation methods, and their large-sample properties and finite-sample performance are worth studying. Furthermore, the proposed methods can also be extended to more general situations, including, but not limited to, dependent functional data, partially observed functional data, and multivariate functional data. Substantial efforts must be devoted to related advances in the future.

Author Contributions

Conceptualization, X.L. and P.Y.; methodology and proof, X.L. and P.Y.; numerical study, X.L. and J.S.; writing—original draft preparation, P.Y. and J.S.; writing—review and editing, X.L., P.Y. and J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (12401356), the Natural Science Foundation of Shanxi Province (20210302124262, 20210302124530, 202203021222223), the National Statistical Science Research Project of China (2022LY089), and the Natural Science Foundation of Shanxi Normal University (JYCJ2022004).

Data Availability Statement

Researchers can download the Tecator dataset from the R package “fda.usc”.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
PFMRM   partial functional multiplicative regression model
PFLM    partial functional linear regression model
LARE    least absolute relative error
LPRE    least product relative error
LAD     least absolute deviations
LS      least squares
RASE    root average square error
MQEP    mean quadratic error of prediction
MSE     mean squared error
CPV     cumulative percentage of variance

Appendix A

In this appendix, we provide the technical proofs for the results presented in Section 3.
Proof of Theorem 1. 
Let $\delta_n = n^{-\frac{2b-1}{2(a+2b)}}$, $S_n = \delta_n^{-1}(\hat{\alpha} - \alpha_0)$, $V_n = \delta_n^{-1}(\hat{\gamma} - \gamma_0)$, $r_i = \int_0^1 \beta_0(t) X_i(t)\,dt - \widehat{U}_i^{\top}\gamma_0$, and $F_n = \{(S_n^{\top}, V_n^{\top})^{\top} : \|(S_n^{\top}, V_n^{\top})^{\top}\| = L\}$, where $L$ is a large enough constant. Next, we show that, for any given $\eta > 0$, there exists a sufficiently large constant $L = L_{\eta}$ such that
$$P\Big\{ \inf_{(S_n^{\top}, V_n^{\top})^{\top} \in F_n} Q_n(\alpha_0 + \delta_n S_n, \gamma_0 + \delta_n V_n) > Q_n(\alpha_0, \gamma_0) \Big\} \ge 1 - \eta. \qquad \text{(A1)}$$
This implies that, with probability at least $1 - \eta$, there exists a local minimizer $(\hat{\alpha}, \hat{\gamma})$ in the ball $\{(S_n^{\top}, V_n^{\top})^{\top} : \|(S_n^{\top}, V_n^{\top})^{\top}\| \le L\}$ such that $\|\hat{\alpha} - \alpha_0\| = O_p(\delta_n)$ and $\|\hat{\gamma} - \gamma_0\| = O_p(\delta_n)$.
By using $\|\hat{v}_j - v_j\|^2 = O_p(n^{-1} j^2)$ (see, e.g., [3,13]), we have
$$r_i^2 = \Big( \int_0^1 \beta_0(t) X_i(t)\,dt - \widehat{U}_i^{\top}\gamma_0 \Big)^2 \le 2\Big( \sum_{j=1}^{m} \langle X_i, \hat{v}_j - v_j \rangle \gamma_{0j} \Big)^2 + 2\Big( \sum_{j=m+1}^{\infty} \langle X_i, v_j \rangle \gamma_{0j} \Big)^2 \equiv 2A_1 + 2A_2.$$
For $A_1$, by conditions C1 and C2 and the Hölder inequality, we obtain
$$A_1 = \Big( \sum_{j=1}^{m} \langle X_i, \hat{v}_j - v_j \rangle \gamma_{0j} \Big)^2 \le c\, m \sum_{j=1}^{m} \|v_j - \hat{v}_j\|^2 \gamma_{0j}^2 = m \sum_{j=1}^{m} O_p(n^{-1} j^{2-2b}) = o_p(\delta_n^2).$$
For $A_2$, given that
$$E\Big[ \sum_{j=m+1}^{\infty} \langle X_i, v_j \rangle \gamma_{0j} \Big] = 0, \qquad \operatorname{Var}\Big[ \sum_{j=m+1}^{\infty} \langle X_i, v_j \rangle \gamma_{0j} \Big] = \sum_{j=m+1}^{\infty} \lambda_j \gamma_{0j}^2 \le c \sum_{j=m+1}^{\infty} j^{-(a+2b)} = O\big(n^{-\frac{a+2b-1}{a+2b}}\big),$$
we have
$$A_2 = O_p\big(n^{-\frac{a+2b-1}{a+2b}}\big) = o_p(\delta_n^2).$$
Taking these together, we obtain
$$r_i^2 = O_p\big(n^{-\frac{a+2b-1}{a+2b}}\big) = o_p(\delta_n^2).$$
Furthermore, a simple calculation yields
$$\begin{aligned} \psi_n(S_n, V_n) &\equiv Q_n(\alpha_0 + \delta_n S_n, \gamma_0 + \delta_n V_n) - Q_n(\alpha_0, \gamma_0) \\ &= \sum_{i=1}^{n} \Big\{ \big| 1 - Y_i^{-1}\exp\{Z_i^{\top}(\alpha_0 + \delta_n S_n) + \widehat{U}_i^{\top}(\gamma_0 + \delta_n V_n)\} \big| - \big| 1 - Y_i^{-1}\exp(Z_i^{\top}\alpha_0 + \widehat{U}_i^{\top}\gamma_0) \big| \Big\} \\ &\quad + \sum_{i=1}^{n} \Big\{ \big| 1 - Y_i\exp\{-Z_i^{\top}(\alpha_0 + \delta_n S_n) - \widehat{U}_i^{\top}(\gamma_0 + \delta_n V_n)\} \big| - \big| 1 - Y_i\exp(-Z_i^{\top}\alpha_0 - \widehat{U}_i^{\top}\gamma_0) \big| \Big\} \\ &\equiv I_1 + I_2. \end{aligned}$$
For $I_1$, by the Knight identity (see, e.g., [17]), for $x \neq 0$,
$$|x - y| - |x| = -y\big[I(x > 0) - I(x < 0)\big] + 2\int_0^y \big[I(x \le s) - I(x \le 0)\big]\,ds.$$
By routine calculation, we have
$$\begin{aligned} I_1 &= -\sum_{i=1}^{n} \omega_{1i} \Big[ I\big( 1 - Y_i^{-1}\exp(Z_i^{\top}\alpha_0 + \widehat{U}_i^{\top}\gamma_0) > 0 \big) - I\big( 1 - Y_i^{-1}\exp(Z_i^{\top}\alpha_0 + \widehat{U}_i^{\top}\gamma_0) < 0 \big) \Big] \\ &\quad + 2\sum_{i=1}^{n} \int_0^{\omega_{1i}} \Big[ I\big( 1 - Y_i^{-1}\exp(Z_i^{\top}\alpha_0 + \widehat{U}_i^{\top}\gamma_0) \le s \big) - I\big( 1 - Y_i^{-1}\exp(Z_i^{\top}\alpha_0 + \widehat{U}_i^{\top}\gamma_0) \le 0 \big) \Big]\,ds \\ &\equiv I_{11} + I_{12}, \end{aligned}$$
where $\omega_{1i} = Y_i^{-1}\big[ \exp\{Z_i^{\top}(\alpha_0 + \delta_n S_n) + \widehat{U}_i^{\top}(\gamma_0 + \delta_n V_n)\} - \exp(Z_i^{\top}\alpha_0 + \widehat{U}_i^{\top}\gamma_0) \big]$.
Using the Taylor series expansion, we have
$$\begin{aligned} I_{11} &= -\delta_n \sum_{i=1}^{n} \big( S_n^{\top} Z_i + V_n^{\top}\widehat{U}_i \big) Y_i^{-1}\exp(Z_i^{\top}\alpha_0 + \widehat{U}_i^{\top}\gamma_0)\,\operatorname{sgn}\big( 1 - Y_i^{-1}\exp(Z_i^{\top}\alpha_0 + \widehat{U}_i^{\top}\gamma_0) \big) \\ &\quad - \frac{1}{2}\delta_n^2 \sum_{i=1}^{n} Y_i^{-1}\big\{ S_n^{\top} Z_i Z_i^{\top} S_n \exp(\xi_i^{[1]}) + V_n^{\top}\widehat{U}_i\widehat{U}_i^{\top} V_n \exp(\xi_i^{[2]}) \big\}\,\operatorname{sgn}\big( 1 - Y_i^{-1}\exp(Z_i^{\top}\alpha_0 + \widehat{U}_i^{\top}\gamma_0) \big) \\ &\equiv I_{111} + I_{112}, \end{aligned}$$
where $\xi_i^{[1]}$ lies between $Z_i^{\top}(\alpha_0 + \delta_n S_n)$ and $Z_i^{\top}\alpha_0$, and $\xi_i^{[2]}$ lies between $\widehat{U}_i^{\top}(\gamma_0 + \delta_n V_n)$ and $\widehat{U}_i^{\top}\gamma_0$.
For $I_{111}$, we have
$$\begin{aligned} I_{111} &= -\sum_{i=1}^{n} \delta_n\,\epsilon_i^{-1}\big( S_n^{\top} Z_i + V_n^{\top}\widehat{U}_i \big)\,\operatorname{sgn}\big( 1 - Y_i^{-1}\exp(Z_i^{\top}\alpha_0 + \widehat{U}_i^{\top}\gamma_0) \big) \\ &\quad + \sum_{i=1}^{n} \epsilon_i^{-1}\delta_n^2\big( S_n^{\top} Z_i Z_i^{\top} S_n + V_n^{\top}\widehat{U}_i\widehat{U}_i^{\top} V_n \big)\,\operatorname{sgn}\big( 1 - Y_i^{-1}\exp(Z_i^{\top}\alpha_0 + \widehat{U}_i^{\top}\gamma_0) \big) + o_p(1) \\ &\equiv I_{1111} + I_{1112} + o_p(1). \end{aligned}$$
It follows that $I_{1111} = o_p(1)\|S_n\|^2 + o_p(1)\|V_n\|^2$, $I_{1112} = O_p(\delta_n^2)\|S_n\|^2 + O_p(\delta_n^2)\|V_n\|^2$, and $I_{112} = O_p(\delta_n^2)\|S_n\|^2 + O_p(\delta_n^2)\|V_n\|^2$. Therefore,
$$I_{11} = O_p(\delta_n^2)\|S_n\|^2 + O_p(\delta_n^2)\|V_n\|^2.$$
For $I_{12}$, define $c_{1i} = \exp(\delta_n Z_i^{\top} S_n + \delta_n \widehat{U}_i^{\top} V_n - r_i)$, $c_{2i} = \exp(\widehat{U}_i^{\top}\gamma_0 - U_i^{\top}\gamma_0 - r_i)$, and $\tau = s\epsilon_i$; then,
$$\begin{aligned} I_{12} &= 2\sum_{i=1}^{n} \int_0^{\omega_{1i}} \big[ I(1 - \epsilon_i^{-1} c_{2i} \le s) - I(1 - \epsilon_i^{-1} c_{2i} \le 0) \big]\,ds \\ &= 2\sum_{i=1}^{n} \int_0^{c_{1i} - c_{2i}} \epsilon_i^{-1}\big[ I(\epsilon_i \le c_{2i} + \tau) - I(\epsilon_i \le c_{2i}) \big]\,d\tau \\ &= 2\sum_{i=1}^{n} \int_0^{c_{1i} - c_{2i}} E_{\epsilon \mid X}\Big\{ \epsilon_i^{-1}\big[ I(\epsilon_i \le c_{2i} + \tau) - I(\epsilon_i \le c_{2i}) \big] \Big\}\,d\tau + o_p(1) \\ &= 2\sum_{i=1}^{n} \int_0^{c_{1i} - c_{2i}} E_{\epsilon \mid X}\big[ I(\epsilon_i \le c_{2i} + \tau) - I(\epsilon_i \le c_{2i}) \big]\,d\tau \\ &\quad + 2\sum_{i=1}^{n} \int_0^{c_{1i} - c_{2i}} E_{\epsilon \mid X}\Big\{ (\epsilon_i^{-1} - 1)\big[ I(\epsilon_i \le c_{2i} + \tau) - I(\epsilon_i \le c_{2i}) \big] \Big\}\,d\tau + o_p(1) \\ &= \delta_n^2 \sum_{i=1}^{n} \big( S_n^{\top} Z_i Z_i^{\top} S_n + V_n^{\top}\widehat{U}_i\widehat{U}_i^{\top} V_n \big)\big(1 + o_p(1)\big) = O_p(\delta_n^2)\|S_n\|^2 + O_p(\delta_n^2)\|V_n\|^2 + o_p(1). \end{aligned}$$
Therefore,
$$I_1 = O_p(\delta_n^2)\|S_n\|^2 + O_p(\delta_n^2)\|V_n\|^2.$$
Similarly, we can prove that
$$I_2 = O_p(\delta_n^2)\|S_n\|^2 + O_p(\delta_n^2)\|V_n\|^2.$$
Then, Equation (A1) holds, and there exists a local minimizer $\hat{\gamma}$ such that $\|\hat{\gamma} - \gamma_0\| = O_p(\delta_n)$. Note that
$$\begin{aligned} \|\hat{\beta}(t) - \beta_0(t)\|^2 &= \Big\| \sum_{j=1}^{m} \hat{\gamma}_j \hat{v}_j(t) - \sum_{j=1}^{\infty} \gamma_{0j} v_j(t) \Big\|^2 \le 2\Big\| \sum_{j=1}^{m} \hat{\gamma}_j \hat{v}_j(t) - \sum_{j=1}^{m} \gamma_{0j} v_j(t) \Big\|^2 + 2\Big\| \sum_{j=m+1}^{\infty} \gamma_{0j} v_j(t) \Big\|^2 \\ &\le 4\Big\| \sum_{j=1}^{m} (\hat{\gamma}_j - \gamma_{0j}) \hat{v}_j(t) \Big\|^2 + 4\Big\| \sum_{j=1}^{m} \gamma_{0j}\big(\hat{v}_j(t) - v_j(t)\big) \Big\|^2 + 2\sum_{j=m+1}^{\infty} \gamma_{0j}^2 \equiv 4D_1 + 4D_2 + 2D_3. \qquad \text{(A2)} \end{aligned}$$
According to Equation (A2), condition C2, the orthonormality of the $\hat{v}_j$, and $\|v_j(t) - \hat{v}_j(t)\|^2 = O_p(n^{-1} j^2)$, we have
$$D_1 = \Big\| \sum_{j=1}^{m} (\hat{\gamma}_j - \gamma_{0j}) \hat{v}_j(t) \Big\|^2 = \sum_{j=1}^{m} |\hat{\gamma}_j - \gamma_{0j}|^2 = \|\hat{\gamma} - \gamma_0\|^2 = O_p(\delta_n^2), \qquad \text{(A3)}$$
$$D_2 = \Big\| \sum_{j=1}^{m} \gamma_{0j}\big(\hat{v}_j(t) - v_j(t)\big) \Big\|^2 \le m \sum_{j=1}^{m} \|\hat{v}_j(t) - v_j(t)\|^2 \gamma_{0j}^2 = O_p\Big( n^{-1} m \sum_{j=1}^{m} j^{2-2b} \Big) = O_p(n^{-1} m) = o_p\big(n^{-\frac{2b-1}{a+2b}}\big) = o_p(\delta_n^2), \qquad \text{(A4)}$$
$$D_3 = \sum_{j=m+1}^{\infty} \gamma_{0j}^2 \le C \sum_{j=m+1}^{\infty} j^{-2b} = O\big(n^{-\frac{2b-1}{a+2b}}\big) = O(\delta_n^2). \qquad \text{(A5)}$$
Then, Theorem 1 follows immediately from (A3)–(A5). □
Proof of Theorem 2. 
Firstly, let
$$\psi_n(\alpha, \gamma) \equiv \sum_{i=1}^{n} \Big\{ \big| 1 - \epsilon_i^{-1}\exp\{Z_i^{\top}(\alpha - \alpha_0) + \widehat{U}_i^{\top}(\gamma - \gamma_0) - r_i\} \big| + \big| 1 - \epsilon_i\exp\{-Z_i^{\top}(\alpha - \alpha_0) - \widehat{U}_i^{\top}(\gamma - \gamma_0) + r_i\} \big| \Big\}.$$
According to the convexity lemma of [18] and Lemma A.1 in [1], for any compact sets $\Theta$ and $\Gamma$ (with $\alpha \in \Theta$ and $\gamma \in \Gamma$), as $n \to \infty$, we have
$$\sup_{\alpha \in \Theta,\, \gamma \in \Gamma} \frac{1}{n}\big| \psi_n(\alpha, \gamma) - E\,\psi_n(\alpha, \gamma) \big| \xrightarrow{p} 0.$$
With a simple calculation, we have
$$\begin{aligned} E\big[ \psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0) \big] &= \sum_{i=1}^{n} E\Big\{ \big| 1 - \epsilon_i^{-1}\exp\{Z_i^{\top}(\alpha - \alpha_0) + \widehat{U}_i^{\top}(\gamma - \gamma_0) - r_i\} \big| \\ &\qquad\quad + \big| 1 - \epsilon_i\exp\{-Z_i^{\top}(\alpha - \alpha_0) - \widehat{U}_i^{\top}(\gamma - \gamma_0) + r_i\} \big| - |1 - \epsilon_i^{-1}| - |1 - \epsilon_i| \Big\} + o_p(1) \\ &= \sum_{i=1}^{n} E\Big\{ (\epsilon_i + \epsilon_i^{-1})\operatorname{sgn}(1 - \epsilon_i)\big[ \exp\{Z_i^{\top}(\alpha - \alpha_0) + \widehat{U}_i^{\top}(\gamma - \gamma_0) - r_i\} - 1 \big] \Big\} \\ &\quad + \sum_{i=1}^{n} E\Big\{ \epsilon_i\operatorname{sgn}(\epsilon_i - 1)\big[ \exp\{Z_i^{\top}(\alpha - \alpha_0) + \widehat{U}_i^{\top}(\gamma - \gamma_0) - r_i\} \\ &\qquad\quad + \exp\{-Z_i^{\top}(\alpha - \alpha_0) - \widehat{U}_i^{\top}(\gamma - \gamma_0) + r_i\} - 2 \big] \Big\} \\ &\quad + 2\sum_{i=1}^{n} E\Big\{ \big[ I\big( \epsilon_i \le \exp\{Z_i^{\top}(\alpha - \alpha_0) + \widehat{U}_i^{\top}(\gamma - \gamma_0) - r_i\} \big) - I(\epsilon_i \le 1) \big] \\ &\qquad\quad \times \big[ \epsilon_i^{-1}\exp\{Z_i^{\top}(\alpha - \alpha_0) + \widehat{U}_i^{\top}(\gamma - \gamma_0) - r_i\} - \epsilon_i\exp\{-Z_i^{\top}(\alpha - \alpha_0) - \widehat{U}_i^{\top}(\gamma - \gamma_0) + r_i\} \big] \Big\} + o_p(1). \qquad \text{(A6)} \end{aligned}$$
According to condition C8, the first term in Equation (A6) is 0. Further, by condition C8, we have
$$2E\big[\epsilon\, I(\epsilon > 1)\big] > E\big[(\epsilon + \epsilon^{-1})\, I(\epsilon > 1)\big] = E\big[(\epsilon + \epsilon^{-1})\, I(\epsilon \le 1)\big] > 2E\big[\epsilon\, I(\epsilon \le 1)\big].$$
This means that $J = E[\epsilon\operatorname{sgn}(\epsilon - 1)] > 0$, so the second term in Equation (A6) is non-negative, and it is easy to show that the third term in Equation (A6) is also non-negative. Thus, for all $\alpha$ and $\gamma$, we have $E[\psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0)] \ge 0$. In addition, if $E[\psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0)] = o_p(1)$, then
$$\sum_{i=1}^{n} E\Big\{ \epsilon_i\operatorname{sgn}(\epsilon_i - 1)\big[ \exp\{Z_i^{\top}(\alpha - \alpha_0) + \widehat{U}_i^{\top}(\gamma - \gamma_0) - r_i\} + \exp\{-Z_i^{\top}(\alpha - \alpha_0) - \widehat{U}_i^{\top}(\gamma - \gamma_0) + r_i\} - 2 \big] \Big\} = o_p(1).$$
Since $(\alpha, \gamma) = (\alpha_0, \gamma_0)$ is the unique minimum point of $\exp\{Z_i^{\top}(\alpha - \alpha_0) + \widehat{U}_i^{\top}(\gamma - \gamma_0) - r_i\} + \exp\{-Z_i^{\top}(\alpha - \alpha_0) - \widehat{U}_i^{\top}(\gamma - \gamma_0) + r_i\}$ up to a factor $(1 + o_p(1))$, it follows from condition C8 and $E[\epsilon\operatorname{sgn}(\epsilon - 1)] > 0$ that $(\alpha, \gamma) = (\alpha_0, \gamma_0)$ is a minimizer of $E[\psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0)]$. Let $\psi(\alpha, \gamma) \equiv n^{-1} E\,\psi_n(\alpha, \gamma)$. Then, for each $\delta_1 > 0$ and $\delta_2 > 0$, there exists $\eta > 0$ such that, whenever $\|\alpha - \alpha_0\| \ge \delta_1$ and $\|\gamma - \gamma_0\| \ge \delta_2$, $\psi(\alpha, \gamma) > \psi(\alpha_0, \gamma_0) + \eta$.
For any constants $\delta_1, \delta_2 > 0$ and $C_1, C_2$, let $(\alpha_n^*, \gamma_n^*)$ be a minimizer of $\psi_n(\alpha, \gamma)$ such that $\delta_1 \le \|\alpha_n^* - \alpha_0\| \le C_1$ and $\delta_2 \le \|\gamma_n^* - \gamma_0\| \le C_2$. According to (A6), as $n \to \infty$, we have $n^{-1}\psi_n(\alpha_n^*, \gamma_n^*) \xrightarrow{p} \psi(\alpha_n^*, \gamma_n^*)$ and $n^{-1}\psi_n(\alpha_n^*, \gamma_n^*) > \psi(\alpha_0, \gamma_0) + \eta$.
On the other hand, according to (A6), for arbitrary positive constants $\delta_1$ and $\delta_2$, we have $\inf_{\|\alpha - \alpha_0\| \le \delta_1,\, \|\gamma - \gamma_0\| \le \delta_2} n^{-1}\psi_n(\alpha, \gamma) \le n^{-1}\psi_n(\alpha_0, \gamma_0) \xrightarrow{p} \psi(\alpha_0, \gamma_0)$. Thus, with probability tending to 1, the minimizer lies inside $\{\|\alpha - \alpha_0\| \le \delta_1,\, \|\gamma - \gamma_0\| \le \delta_2\}$ by the strict convexity of $\psi_n(\alpha, \gamma)$, and the local minimizer inside this set is the unique global minimizer. According to the definition of $(\hat{\alpha}_n, \hat{\gamma}_n)$, as $n \to \infty$, $P\big(\hat{\alpha}_n \in \{\alpha : \|\alpha - \alpha_0\| \le \delta_1\}\big) \to 1$ and $P\big(\hat{\gamma}_n \in \{\gamma : \|\gamma - \gamma_0\| \le \delta_2\}\big) \to 1$. Letting $\delta_1 \to 0$ and $\delta_2 \to 0$, we obtain the weak consistency of $\hat{\alpha}_n$ and $\hat{\gamma}_n$.
Next, we prove the asymptotic normality. Note that $\exp(x) + \exp(-x) - 2 = x^2 + O(|x|^3)$. By invoking the Taylor expansion, we have
$$\begin{aligned} \frac{1}{n} E\big[ \psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0) \big] &= J\,\frac{1}{n}\sum_{i=1}^{n} E\Big\{ \exp\{Z_i^{\top}(\alpha - \alpha_0) + \widehat{U}_i^{\top}(\gamma - \gamma_0) - r_i\} + \exp\{-Z_i^{\top}(\alpha - \alpha_0) - \widehat{U}_i^{\top}(\gamma - \gamma_0) + r_i\} - 2 \Big\} \\ &\quad + 2f(1)\,\frac{1}{n}\sum_{i=1}^{n} E\Big\{ (\alpha - \alpha_0)^{\top} Z_i Z_i^{\top}(\alpha - \alpha_0) + (\alpha - \alpha_0)^{\top} Z_i\widehat{U}_i^{\top}(\gamma - \gamma_0) \\ &\qquad\quad + (\gamma - \gamma_0)^{\top}\widehat{U}_i Z_i^{\top}(\alpha - \alpha_0) + (\gamma - \gamma_0)^{\top}\widehat{U}_i\widehat{U}_i^{\top}(\gamma - \gamma_0) \Big\} \\ &\quad + O(\|\alpha - \alpha_0\|^3) + O(\|\gamma - \gamma_0\|^3) + O_p\big(n^{-\frac{a+2b-1}{a+2b}}\big) \\ &= \{J + 2f(1)\}\big[ (\alpha - \alpha_0)^{\top} V_1(\alpha - \alpha_0) + (\alpha - \alpha_0)^{\top} V_2(\gamma - \gamma_0) + (\gamma - \gamma_0)^{\top} V_3(\alpha - \alpha_0) + (\gamma - \gamma_0)^{\top} V_4(\gamma - \gamma_0) \big] \\ &\quad + O(\|\alpha - \alpha_0\|^3) + O(\|\gamma - \gamma_0\|^3) + O_p\big(n^{-\frac{a+2b-1}{a+2b}}\big), \qquad \text{(A7)} \end{aligned}$$
where $J = E[\epsilon\operatorname{sgn}(\epsilon - 1)] > 0$, $V_1 = E(Z Z^{\top})$, $V_2 = E(Z\widehat{U}^{\top})$, $V_3 = E(\widehat{U} Z^{\top})$, and $V_4 = E(\widehat{U}\widehat{U}^{\top})$.
Let $W_1 = \sum_{i=1}^{n} (\epsilon_i + \epsilon_i^{-1})\operatorname{sgn}(\epsilon_i - 1) Z_i$ and $W_2 = \sum_{i=1}^{n} (\epsilon_i + \epsilon_i^{-1})\operatorname{sgn}(\epsilon_i - 1)\widehat{U}_i$.
Next, we show that, for every $C_1 > 0$ and $C_2 > 0$, one has
$$\sup_{\substack{\|\alpha - \alpha_0\| \le C_1 n^{-1/2} \\ \|\gamma - \gamma_0\| \le C_2 n^{-\frac{2b-1}{2(a+2b)}}}} \Big| \psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0) + W_1^{\top}(\alpha - \alpha_0) + W_2^{\top}(\gamma - \gamma_0) - E\big[ \psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0) \big] \Big| \xrightarrow{p} 0. \qquad \text{(A8)}$$
Let $\theta_1 = \sqrt{n}(\alpha - \alpha_0)$ and $\theta_2 = \delta_n^{-1}(\gamma - \gamma_0)$. Then, Equation (A8) can be rewritten as
$$\sup_{\|\theta_1\| \le C_1,\, \|\theta_2\| \le C_2} \Big| \psi_n\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) - \psi_n(\alpha_0, \gamma_0) + \tfrac{1}{\sqrt{n}} W_1^{\top}\theta_1 + \delta_n W_2^{\top}\theta_2 - E\Big[ \psi_n\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) - \psi_n(\alpha_0, \gamma_0) \Big] \Big| \xrightarrow{p} 0.$$
To prove the above equation, we first show that the following holds for each fixed $\theta_1$ and $\theta_2$:
$$\psi_n\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) - \psi_n(\alpha_0, \gamma_0) + \tfrac{1}{\sqrt{n}} W_1^{\top}\theta_1 + \delta_n W_2^{\top}\theta_2 - E\Big[ \psi_n\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) - \psi_n(\alpha_0, \gamma_0) \Big] \xrightarrow{p} 0. \qquad \text{(A9)}$$
Let
$$G_i(\alpha, \gamma) \equiv \epsilon_i\operatorname{sgn}(\epsilon_i - 1)\Big\{ \exp\{Z_i^{\top}(\alpha - \alpha_0) + \widehat{U}_i^{\top}(\gamma - \gamma_0) - r_i\} + \exp\{-Z_i^{\top}(\alpha - \alpha_0) - \widehat{U}_i^{\top}(\gamma - \gamma_0) + r_i\} - 2 \Big\},$$
$$\begin{aligned} R_i(\alpha, \gamma) \equiv\; & \Big[ I\big( \epsilon_i > \exp\{Z_i^{\top}(\alpha - \alpha_0) + \widehat{U}_i^{\top}(\gamma - \gamma_0) - r_i\} \big) - I(\epsilon_i > 1) \Big] \\ & \times \Big[ \epsilon_i\exp\{-Z_i^{\top}(\alpha - \alpha_0) - \widehat{U}_i^{\top}(\gamma - \gamma_0) + r_i\} - \epsilon_i^{-1}\exp\{Z_i^{\top}(\alpha - \alpha_0) + \widehat{U}_i^{\top}(\gamma - \gamma_0) - r_i\} \Big]. \end{aligned}$$
Then, we have
$$\begin{aligned} & \psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0) - E\big[ \psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0) \big] \\ &\quad = \sum_{i=1}^{n} (\epsilon_i + \epsilon_i^{-1})\operatorname{sgn}(\epsilon_i - 1)\big[ \exp\{Z_i^{\top}(\alpha - \alpha_0) + \widehat{U}_i^{\top}(\gamma - \gamma_0) - r_i\} - 1 \big] \\ &\qquad + \sum_{i=1}^{n} \big[ G_i(\alpha, \gamma) - E\,G_i(\alpha, \gamma) \big] + 2\sum_{i=1}^{n} \big[ R_i(\alpha, \gamma) - E\,R_i(\alpha, \gamma) \big]. \end{aligned}$$
For each fixed $\theta_1$ and $\theta_2$, as $n \to \infty$, one has
$$\begin{aligned} & \sum_{i=1}^{n} E\Big[ G_i\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) - E\,G_i\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) \Big]^2 \\ &\quad \le \sum_{i=1}^{n} E\big[ \epsilon_i\operatorname{sgn}(\epsilon_i - 1) \big]^2\, E\Big[ \exp\big( -\tfrac{1}{\sqrt{n}} Z_i^{\top}\theta_1 - \delta_n\widehat{U}_i^{\top}\theta_2 + r_i \big) + \exp\big( \tfrac{1}{\sqrt{n}} Z_i^{\top}\theta_1 + \delta_n\widehat{U}_i^{\top}\theta_2 - r_i \big) - 2 \Big]^2 \\ &\quad = \sum_{i=1}^{n} E\big[ \epsilon_i\operatorname{sgn}(\epsilon_i - 1) \big]^2\, E\Big( \tfrac{1}{n}\theta_1^{\top} Z_i Z_i^{\top}\theta_1 + \tfrac{\delta_n}{\sqrt{n}}\theta_1^{\top} Z_i\widehat{U}_i^{\top}\theta_2 + \tfrac{\delta_n}{\sqrt{n}}\theta_2^{\top}\widehat{U}_i Z_i^{\top}\theta_1 + \delta_n^2\theta_2^{\top}\widehat{U}_i\widehat{U}_i^{\top}\theta_2 + a_i \Big)^2 + O_p\big(n^{-\frac{a+2b-1}{a+2b}}\big) \to 0, \end{aligned}$$
where the $a_i$ satisfy $P(|a_i| \le c\,\delta_n^{3/2}) = 1$ for $i = 1, \ldots, n$. Hence, as $n \to \infty$, one has
$$\sum_{i=1}^{n} \Big\{ G_i\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) - E\big[ G_i\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) \big] \Big\} \xrightarrow{p} 0. \qquad \text{(A10)}$$
On the other hand, by the Taylor series expansion, for every fixed $\theta_1$ and $\theta_2$, we have
$$\begin{aligned} & E\Big[ \epsilon\exp\big( -\tfrac{1}{\sqrt{n}} Z_i^{\top}\theta_1 - \delta_n\widehat{U}_i^{\top}\theta_2 + r_i \big) - \epsilon^{-1}\exp\big( \tfrac{1}{\sqrt{n}} Z_i^{\top}\theta_1 + \delta_n\widehat{U}_i^{\top}\theta_2 - r_i \big) \Big]^2 \\ &\quad = E\Big( \epsilon - \epsilon^{-1} - \tfrac{\epsilon}{\sqrt{n}} Z_i^{\top}\theta_1 - \epsilon\,\delta_n\widehat{U}_i^{\top}\theta_2 - \tfrac{\epsilon^{-1}}{\sqrt{n}} Z_i^{\top}\theta_1 - \epsilon^{-1}\delta_n\widehat{U}_i^{\top}\theta_2 + b \Big)^2 + O_p\big(n^{-\frac{a+2b-1}{a+2b}}\big) \\ &\quad = E\Big( (\epsilon - 1) - (\epsilon^{-1} - 1) - (\epsilon - 1)\tfrac{1}{\sqrt{n}} Z_i^{\top}\theta_1 - (\epsilon - 1)\delta_n\widehat{U}_i^{\top}\theta_2 - (\epsilon^{-1} - 1)\tfrac{1}{\sqrt{n}} Z_i^{\top}\theta_1 - (\epsilon^{-1} - 1)\delta_n\widehat{U}_i^{\top}\theta_2 \\ &\qquad\quad - \tfrac{2}{\sqrt{n}} Z_i^{\top}\theta_1 - 2\delta_n\widehat{U}_i^{\top}\theta_2 + b \Big)^2 + O_p\big(n^{-\frac{a+2b-1}{a+2b}}\big) \\ &\quad \le E\Big\{ 9\big[ (\epsilon - 1)^2 + (\epsilon^{-1} - 1)^2 + 4 \big]\tfrac{1}{n}\theta_1^{\top} Z_i Z_i^{\top}\theta_1 + 9\big[ (\epsilon - 1)^2 + (\epsilon^{-1} - 1)^2 + 4 \big]\delta_n^2\theta_2^{\top}\widehat{U}_i\widehat{U}_i^{\top}\theta_2 \\ &\qquad\quad + 9(\epsilon - 1)^2 + 9(\epsilon^{-1} - 1)^2 + b^2 \Big\}\big(1 + o_p(1)\big) + O_p\big(n^{-\frac{a+2b-1}{a+2b}}\big), \end{aligned}$$
where $b$ satisfies $P(|b| \le c\,\delta_n) = 1$.
Similarly, we have
$$\begin{aligned} & \sum_{i=1}^{n} E\Big[ R_i\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) - E\,R_i\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) \Big]^2 \\ &\quad \le \sum_{i=1}^{n} E\Big\{ \Big[ I\big( \tfrac{1}{\sqrt{n}} Z_i^{\top}\theta_1 + \delta_n\widehat{U}_i^{\top}\theta_2 > 0 \big)\, I\big( 0 < \log\epsilon_i \le \tfrac{1}{\sqrt{n}} Z_i^{\top}\theta_1 + \delta_n\widehat{U}_i^{\top}\theta_2 \big) \\ &\qquad\quad + I\big( \tfrac{1}{\sqrt{n}} Z_i^{\top}\theta_1 + \delta_n\widehat{U}_i^{\top}\theta_2 \le 0 \big)\, I\big( 0 \ge \log\epsilon_i > \tfrac{1}{\sqrt{n}} Z_i^{\top}\theta_1 + \delta_n\widehat{U}_i^{\top}\theta_2 \big) \Big] \\ &\qquad \times \Big[ 9\big[ (\epsilon - 1)^2 + (\epsilon^{-1} - 1)^2 + 4 \big]\tfrac{1}{n}\theta_1^{\top} Z_i Z_i^{\top}\theta_1 + 9\big[ (\epsilon - 1)^2 + (\epsilon^{-1} - 1)^2 + 4 \big]\delta_n^2\theta_2^{\top}\widehat{U}_i\widehat{U}_i^{\top}\theta_2 \\ &\qquad\quad + 9(\epsilon - 1)^2 + 9(\epsilon^{-1} - 1)^2 + b_i^2 \Big] \Big\}\big(1 + o_p(1)\big) + O_p\big(n^{-\frac{a+2b-1}{a+2b}}\big) \to 0, \end{aligned}$$
where the $b_i$ satisfy $P(|b_i| \le c\,\delta_n) = 1$. Furthermore, for each fixed $\theta_1$ and $\theta_2$, one has
$$\sum_{i=1}^{n} \Big\{ R_i\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) - E\big[ R_i\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) \big] \Big\} \xrightarrow{p} 0. \qquad \text{(A11)}$$
Combining Equation (A11) with condition C8, we complete the proof of (A9). According to Lemma A.1 in [1], $\psi_n\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) - \psi_n(\alpha_0, \gamma_0) + \tfrac{1}{\sqrt{n}} W_1^{\top}\theta_1 + \delta_n W_2^{\top}\theta_2$ is convex in $(\theta_1, \theta_2)$. Then, for all constants $C_1 > 0$ and $C_2 > 0$, we have
$$\sup_{\|\theta_1\| \le C_1,\, \|\theta_2\| \le C_2} \Big| \psi_n\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) - \psi_n(\alpha_0, \gamma_0) + \tfrac{1}{\sqrt{n}} W_1^{\top}\theta_1 + \delta_n W_2^{\top}\theta_2 - E\Big[ \psi_n\big(\alpha_0 + \tfrac{\theta_1}{\sqrt{n}}, \gamma_0 + \theta_2\delta_n\big) - \psi_n(\alpha_0, \gamma_0) \Big] \Big| \xrightarrow{p} 0.$$
Lastly, let
$$\begin{aligned} \xi_n(\alpha, \gamma) \equiv\; & \psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0) - n\{J + 2f(1)\}\big[ (\alpha - \alpha_0)^{\top} V_1(\alpha - \alpha_0) + (\alpha - \alpha_0)^{\top} V_2(\gamma - \gamma_0) \\ & + (\gamma - \gamma_0)^{\top} V_3(\alpha - \alpha_0) + (\gamma - \gamma_0)^{\top} V_4(\gamma - \gamma_0) \big] + W_1^{\top}(\alpha - \alpha_0) + W_2^{\top}(\gamma - \gamma_0). \end{aligned}$$
Then, for each $C_1 > 0$ and $C_2 > 0$, as $n \to \infty$, we have
$$\sup_{\|\alpha - \alpha_0\| \le C_1 n^{-1/2},\, \|\gamma - \gamma_0\| \le C_2\delta_n} |\xi_n(\alpha, \gamma)| \xrightarrow{p} 0. \qquad \text{(A12)}$$
Let $(\hat{\alpha}_n^*, \hat{\gamma}_n^*)$ be the minimizer of $n\{J + 2f(1)\}\big[ (\alpha - \alpha_0)^{\top} V_1(\alpha - \alpha_0) + (\alpha - \alpha_0)^{\top} V_2(\gamma - \gamma_0) + (\gamma - \gamma_0)^{\top} V_3(\alpha - \alpha_0) + (\gamma - \gamma_0)^{\top} V_4(\gamma - \gamma_0) \big] - W_1^{\top}(\alpha - \alpha_0) - W_2^{\top}(\gamma - \gamma_0)$. A simple calculation yields
$$\hat{\alpha}_n^* - \alpha_0 = \frac{1}{2n}\{J + 2f(1)\}^{-1}\big[ V_1 - V_2 V_4^{-1} V_3 \big]^{-1}\big[ W_1 - V_2 V_4^{-1} W_2 \big].$$
According to the definitions of $W_1$ and $W_2$, for each $\delta > 0$, there exist constants $C_3$, $C_4$, and $N_{\delta}$ such that, for all $n \ge N_{\delta}$, $P\big( \|\hat{\alpha}_n^* - \alpha_0\| > C_3 n^{-1/2} \big) \le \frac{\delta}{2}$ and $P\big( \|\hat{\gamma}_n^* - \gamma_0\| > C_4\delta_n \big) \le \frac{\delta}{2}$. According to (A11), for each $\eta > 0$, there exists a constant $N_{\eta}$ such that, for all $n \ge N_{\eta}$,
$$P\Bigg( \sup_{\|\alpha - \alpha_0\| \le C_3 n^{-1/2},\, \|\gamma - \gamma_0\| \le C_4\delta_n} |\xi_n(\alpha, \gamma)| > \eta \Bigg) \le \frac{\delta}{2}.$$
Therefore, for each $\delta, \eta > 0$, there exists $N = \max\{N_{\delta}, N_{\eta}\}$ such that, for all $n > N$,
$$\begin{aligned} P\big( |\xi_n(\hat{\alpha}_n^*, \hat{\gamma}_n^*)| > \eta \big) &= P\big( |\xi_n(\hat{\alpha}_n^*, \hat{\gamma}_n^*)| > \eta,\; \|\hat{\alpha}_n^* - \alpha_0\| > C_3 n^{-1/2} \text{ or } \|\hat{\gamma}_n^* - \gamma_0\| > C_4\delta_n \big) \\ &\quad + P\big( |\xi_n(\hat{\alpha}_n^*, \hat{\gamma}_n^*)| > \eta,\; \|\hat{\alpha}_n^* - \alpha_0\| \le C_3 n^{-1/2},\; \|\hat{\gamma}_n^* - \gamma_0\| \le C_4\delta_n \big) \\ &\le P\big( \|\hat{\alpha}_n^* - \alpha_0\| > C_3 n^{-1/2} \big) + P\big( \|\hat{\gamma}_n^* - \gamma_0\| > C_4\delta_n \big) + P\Bigg( \sup_{\substack{\|\alpha - \alpha_0\| \le C_3 n^{-1/2} \\ \|\gamma - \gamma_0\| \le C_4\delta_n}} |\xi_n(\alpha, \gamma)| > \eta \Bigg) \le \delta. \end{aligned}$$
This means that $\xi_n(\hat{\alpha}_n^*, \hat{\gamma}_n^*) = o_p(1)$. Similarly, for all constants $C_1 > 0$ and $C_2 > 0$, one has
$$\sup_{\|\alpha - \hat{\alpha}_n^*\| \le C_1 n^{-1/2},\, \|\gamma - \hat{\gamma}_n^*\| \le C_2\delta_n} |\xi_n(\alpha, \gamma)| = o_p(1).$$
Let $A_1 \equiv \frac{1}{2n}\{J + 2f(1)\}^{-1}\big[ V_1 - V_2 V_4^{-1} V_3 \big]^{-1}\big[ W_1 - V_2 V_4^{-1} W_2 \big]$ and $A_2 \equiv \frac{1}{2n}\{J + 2f(1)\}^{-1} V_4^{-1} W_2 - V_4^{-1} V_3 A_1$, so that $A_1 = \hat{\alpha}_n^* - \alpha_0$ and $A_2 = \hat{\gamma}_n^* - \gamma_0$. Note that
$$\begin{aligned} \psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0) &= n\{J + 2f(1)\}\big[ (\alpha - \hat{\alpha}_n^*)^{\top} V_1(\alpha - \hat{\alpha}_n^*) + (\alpha - \hat{\alpha}_n^*)^{\top} V_2(\gamma - \hat{\gamma}_n^*) \\ &\quad + (\gamma - \hat{\gamma}_n^*)^{\top} V_3(\alpha - \hat{\alpha}_n^*) + (\gamma - \hat{\gamma}_n^*)^{\top} V_4(\gamma - \hat{\gamma}_n^*) \big] - W_1^{\top} A_1 - W_2^{\top} A_2 + \xi_n(\alpha, \gamma). \end{aligned}$$
Then, for any constants $C_1, C_2, C_5, C_6$ such that $0 < C_5 < C_1 < \infty$ and $0 < C_6 < C_2 < \infty$, one has
$$\begin{aligned} & \inf_{\substack{C_5 n^{-1/2} \le \|\alpha - \hat{\alpha}_n^*\| \le C_1 n^{-1/2} \\ C_6\delta_n \le \|\gamma - \hat{\gamma}_n^*\| \le C_2\delta_n}} \big[ \psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0) \big] \\ &\quad \ge \inf_{\substack{C_5 n^{-1/2} \le \|\alpha - \hat{\alpha}_n^*\| \le C_1 n^{-1/2} \\ C_6\delta_n \le \|\gamma - \hat{\gamma}_n^*\| \le C_2\delta_n}} \Big\{ n\{J + 2f(1)\}\big[ (\alpha - \hat{\alpha}_n^*)^{\top} V_1(\alpha - \hat{\alpha}_n^*) + (\alpha - \hat{\alpha}_n^*)^{\top} V_2(\gamma - \hat{\gamma}_n^*) \\ &\qquad + (\gamma - \hat{\gamma}_n^*)^{\top} V_3(\alpha - \hat{\alpha}_n^*) + (\gamma - \hat{\gamma}_n^*)^{\top} V_4(\gamma - \hat{\gamma}_n^*) \big] - W_1^{\top} A_1 - W_2^{\top} A_2 \Big\} - \sup_{\substack{C_5 n^{-1/2} \le \|\alpha - \hat{\alpha}_n^*\| \le C_1 n^{-1/2} \\ C_6\delta_n \le \|\gamma - \hat{\gamma}_n^*\| \le C_2\delta_n}} |\xi_n(\alpha, \gamma)| \\ &\quad \ge n\{J + 2f(1)\}\Big( \tfrac{1}{n} C_5^2\lambda_1 + \tfrac{\delta_n}{\sqrt{n}} C_5 C_6\lambda_2 + \tfrac{\delta_n}{\sqrt{n}} C_5 C_6\lambda_3 + \delta_n^2 C_6^2\lambda_4 \Big) - W_1^{\top} A_1 - W_2^{\top} A_2 + o_p(1), \qquad \text{(A13)} \end{aligned}$$
where $\lambda_1, \lambda_2, \lambda_3$, and $\lambda_4$ denote the smallest eigenvalues of $V_1, V_2, V_3$, and $V_4$, respectively.
On the other hand, for any constants $C_7$ and $C_8$, one has
$$\inf_{\|\alpha - \hat{\alpha}_n^*\| \le C_7 n^{-1/2},\, \|\gamma - \hat{\gamma}_n^*\| \le C_8\delta_n} \big[ \psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0) \big] \le \psi_n(\hat{\alpha}_n^*, \hat{\gamma}_n^*) - \psi_n(\alpha_0, \gamma_0) = -W_1^{\top} A_1 - W_2^{\top} A_2 + \xi_n(\hat{\alpha}_n^*, \hat{\gamma}_n^*) = -W_1^{\top} A_1 - W_2^{\top} A_2 + o_p(1). \qquad \text{(A14)}$$
According to (A13) and (A14), with probability tending to 1, the minimizer of $\psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0)$ lies inside $\{\|\alpha - \hat{\alpha}_n^*\| \le C_7 n^{-1/2},\, \|\gamma - \hat{\gamma}_n^*\| \le C_8\delta_n\}$. Since $\psi_n(\alpha, \gamma) - \psi_n(\alpha_0, \gamma_0)$ is convex, this local minimizer is also the unique global minimizer. Thus,
$$\hat{\alpha}_n - \alpha_0 = \hat{\alpha}_n^* - \alpha_0 + o_p(n^{-1/2}) = \frac{1}{2n}\{J + 2f(1)\}^{-1}\big[ V_1 - V_2 V_4^{-1} V_3 \big]^{-1}\big[ W_1 - V_2 V_4^{-1} W_2 \big] + o_p(n^{-1/2}).$$
Then, as $n \to \infty$, we have
$$\sqrt{n}\,(\hat{\alpha}_n - \alpha_0) \xrightarrow{D} N\Big( 0, \tfrac{1}{4}\{J + 2f(1)\}^{-2}\,\Sigma^{-1} A \Big),$$
where $A = E[(\epsilon + \epsilon^{-1})^2]$ and $J = E[\epsilon\operatorname{sgn}(\epsilon - 1)]$. □
Proof of Theorem 3. 
Let $\delta_n = n^{-\frac{2b-1}{2(a+2b)}}$, $S_n = \delta_n^{-1}(\tilde{\alpha} - \alpha_0)$, $V_n = \delta_n^{-1}(\tilde{\gamma} - \gamma_0)$, $r_i = \int_0^1 \beta_0(t) X_i(t)\,dt - \widehat{U}_i^{\top}\gamma_0$, and $F_n = \{(S_n^{\top}, V_n^{\top})^{\top} : \|(S_n^{\top}, V_n^{\top})^{\top}\| = L\}$, where $L$ is a large enough constant. Next, we show that, for any given $\eta > 0$, there exists a sufficiently large constant $L = L_{\eta}$ such that
$$P\Big\{ \inf_{(S_n^{\top}, V_n^{\top})^{\top} \in F_n} Q_n(\alpha_0 + \delta_n S_n, \gamma_0 + \delta_n V_n) > Q_n(\alpha_0, \gamma_0) \Big\} \ge 1 - \eta. \qquad \text{(A15)}$$
This implies that, with probability at least $1 - \eta$, there exists a local minimizer $(\tilde{\alpha}, \tilde{\gamma})$ in the ball $\{(S_n^{\top}, V_n^{\top})^{\top} : \|(S_n^{\top}, V_n^{\top})^{\top}\| \le L\}$ such that $\|\tilde{\alpha} - \alpha_0\| = O_p(\delta_n)$ and $\|\tilde{\gamma} - \gamma_0\| = O_p(\delta_n)$.
With a simple calculation, we have
$$\begin{aligned} \psi_n(S_n, V_n) &\equiv Q_n(\alpha_0 + \delta_n S_n, \gamma_0 + \delta_n V_n) - Q_n(\alpha_0, \gamma_0) \\ &= \sum_{i=1}^{n} \Big\{ Y_i^{-1}\exp\{Z_i^{\top}(\alpha_0 + \delta_n S_n) + \widehat{U}_i^{\top}(\gamma_0 + \delta_n V_n)\} - Y_i^{-1}\exp(Z_i^{\top}\alpha_0 + \widehat{U}_i^{\top}\gamma_0) \Big\} \\ &\quad + \sum_{i=1}^{n} \Big\{ Y_i\exp\{-Z_i^{\top}(\alpha_0 + \delta_n S_n) - \widehat{U}_i^{\top}(\gamma_0 + \delta_n V_n)\} - Y_i\exp(-Z_i^{\top}\alpha_0 - \widehat{U}_i^{\top}\gamma_0) \Big\} \equiv I_1 + I_2. \end{aligned}$$
Invoking the Taylor expansion, we have
$$I_1 = \delta_n \sum_{i=1}^{n} \big( S_n^{\top} Z_i + V_n^{\top}\widehat{U}_i \big) Y_i^{-1}\exp(Z_i^{\top}\alpha_0 + \widehat{U}_i^{\top}\gamma_0) + \frac{1}{2}\delta_n^2 \sum_{i=1}^{n} Y_i^{-1}\big\{ S_n^{\top} Z_i Z_i^{\top} S_n \exp(\xi_i^{[1]}) + V_n^{\top}\widehat{U}_i\widehat{U}_i^{\top} V_n \exp(\xi_i^{[2]}) \big\} \equiv I_{11} + I_{12},$$
where $\xi_i^{[1]}$ lies between $Z_i^{\top}(\alpha_0 + \delta_n S_n)$ and $Z_i^{\top}\alpha_0$, and $\xi_i^{[2]}$ lies between $\widehat{U}_i^{\top}(\gamma_0 + \delta_n V_n)$ and $\widehat{U}_i^{\top}\gamma_0$. For $I_{11}$, we have
$$I_{11} = \sum_{i=1}^{n} \delta_n\,\epsilon_i^{-1}\big( S_n^{\top} Z_i + V_n^{\top}\widehat{U}_i \big) + \sum_{i=1}^{n} \epsilon_i^{-1}\delta_n^2\big( S_n^{\top} Z_i Z_i^{\top} S_n + V_n^{\top}\widehat{U}_i\widehat{U}_i^{\top} V_n \big) + o_p(1) \equiv I_{111} + I_{112} + o_p(1).$$
It is easy to obtain $I_{111} = o_p(1)\|S_n\|^2 + o_p(1)\|V_n\|^2$, $I_{112} = O_p(\delta_n^2)\|S_n\|^2 + O_p(\delta_n^2)\|V_n\|^2$, and $I_{12} = O_p(\delta_n^2)\|S_n\|^2 + O_p(\delta_n^2)\|V_n\|^2$. Furthermore, we have
$$I_1 = O_p(\delta_n^2)\|S_n\|^2 + O_p(\delta_n^2)\|V_n\|^2, \qquad I_2 = O_p(\delta_n^2)\|S_n\|^2 + O_p(\delta_n^2)\|V_n\|^2.$$
Therefore, Equation (A15) holds, and there exists a local minimizer $\tilde{\gamma}$ such that $\|\tilde{\gamma} - \gamma_0\| = O_p(\delta_n)$.
Note that
$$\begin{aligned} \|\tilde{\beta}(t) - \beta_0(t)\|^2 &= \Big\| \sum_{j=1}^{m} \tilde{\gamma}_j \hat{v}_j(t) - \sum_{j=1}^{\infty} \gamma_{0j} v_j(t) \Big\|^2 \le 2\Big\| \sum_{j=1}^{m} \tilde{\gamma}_j \hat{v}_j(t) - \sum_{j=1}^{m} \gamma_{0j} v_j(t) \Big\|^2 + 2\Big\| \sum_{j=m+1}^{\infty} \gamma_{0j} v_j(t) \Big\|^2 \\ &\le 4\Big\| \sum_{j=1}^{m} (\tilde{\gamma}_j - \gamma_{0j}) \hat{v}_j(t) \Big\|^2 + 4\Big\| \sum_{j=1}^{m} \gamma_{0j}\big(\hat{v}_j(t) - v_j(t)\big) \Big\|^2 + 2\sum_{j=m+1}^{\infty} \gamma_{0j}^2 \equiv 4D_1 + 4D_2 + 2D_3. \end{aligned}$$
According to Equation (5), condition C2, the orthonormality of the $\hat{v}_j$, and $\|v_j(t) - \hat{v}_j(t)\|^2 = O_p(n^{-1} j^2)$, we have
$$D_1 = \Big\| \sum_{j=1}^{m} (\tilde{\gamma}_j - \gamma_{0j}) \hat{v}_j(t) \Big\|^2 = \sum_{j=1}^{m} |\tilde{\gamma}_j - \gamma_{0j}|^2 = \|\tilde{\gamma} - \gamma_0\|^2 = O_p(\delta_n^2), \qquad \text{(A16)}$$
$$D_2 = \Big\| \sum_{j=1}^{m} \gamma_{0j}\big(\hat{v}_j(t) - v_j(t)\big) \Big\|^2 \le m \sum_{j=1}^{m} \|\hat{v}_j(t) - v_j(t)\|^2 \gamma_{0j}^2 = O_p\Big( n^{-1} m \sum_{j=1}^{m} j^{2-2b} \Big) = O_p(n^{-1} m) = o_p\big(n^{-\frac{2b-1}{a+2b}}\big) = o_p(\delta_n^2), \qquad \text{(A17)}$$
$$D_3 = \sum_{j=m+1}^{\infty} \gamma_{0j}^2 \le C \sum_{j=m+1}^{\infty} j^{-2b} = O\big(n^{-\frac{2b-1}{a+2b}}\big) = O(\delta_n^2). \qquad \text{(A18)}$$
Then, by combining (A16)–(A18), we complete the proof of Theorem 3. □
Proof of Theorem 4. 
According to Theorem 3, as $n \to \infty$, with probability tending to 1, $Q_n(\alpha, \gamma)$ achieves its minimal value at $(\tilde{\alpha}, \tilde{\gamma})$. We therefore have the following score equations:
$$\frac{1}{n}\sum_{i=1}^{n} \Big\{ -Z_i Y_i\exp(-Z_i^{\top}\tilde{\alpha} - \widehat{U}_i^{\top}\tilde{\gamma}) + Z_i Y_i^{-1}\exp(Z_i^{\top}\tilde{\alpha} + \widehat{U}_i^{\top}\tilde{\gamma}) \Big\} = 0, \qquad \text{(A19)}$$
$$\frac{1}{n}\sum_{i=1}^{n} \Big\{ -\widehat{U}_i Y_i\exp(-Z_i^{\top}\tilde{\alpha} - \widehat{U}_i^{\top}\tilde{\gamma}) + \widehat{U}_i Y_i^{-1}\exp(Z_i^{\top}\tilde{\alpha} + \widehat{U}_i^{\top}\tilde{\gamma}) \Big\} = 0. \qquad \text{(A20)}$$
By Equations (A19) and (A20), we have
$$\frac{1}{n}\sum_{i=1}^{n} \Big\{ -Z_i\epsilon_i\exp\{-Z_i^{\top}(\tilde{\alpha} - \alpha_0) - \widehat{U}_i^{\top}(\tilde{\gamma} - \gamma_0) + r_i\} + Z_i\epsilon_i^{-1}\exp\{Z_i^{\top}(\tilde{\alpha} - \alpha_0) + \widehat{U}_i^{\top}(\tilde{\gamma} - \gamma_0) - r_i\} \Big\} = 0, \qquad \text{(A21)}$$
$$\frac{1}{n}\sum_{i=1}^{n} \Big\{ -\widehat{U}_i\epsilon_i\exp\{-Z_i^{\top}(\tilde{\alpha} - \alpha_0) - \widehat{U}_i^{\top}(\tilde{\gamma} - \gamma_0) + r_i\} + \widehat{U}_i\epsilon_i^{-1}\exp\{Z_i^{\top}(\tilde{\alpha} - \alpha_0) + \widehat{U}_i^{\top}(\tilde{\gamma} - \gamma_0) - r_i\} \Big\} = 0. \qquad \text{(A22)}$$
Using the Taylor series expansion for Equations (A21) and (A22), we have, in particular,
$$\frac{1}{n}\sum_{i=1}^{n} \Big\{ -\widehat{U}_i\epsilon_i\big( 1 - Z_i^{\top}(\tilde{\alpha} - \alpha_0) - \widehat{U}_i^{\top}(\tilde{\gamma} - \gamma_0) + r_i \big) + \widehat{U}_i\epsilon_i^{-1}\big( 1 + Z_i^{\top}(\tilde{\alpha} - \alpha_0) + \widehat{U}_i^{\top}(\tilde{\gamma} - \gamma_0) - r_i \big) \Big\} + o_p(1) = 0.$$
Let $V_1 = \frac{1}{n}\sum_{i=1}^{n} Z_i Z_i^{\top}$, $V_2 = \frac{1}{n}\sum_{i=1}^{n} Z_i\widehat{U}_i^{\top}$, $V_3 = \frac{1}{n}\sum_{i=1}^{n} \widehat{U}_i Z_i^{\top}$, and $V_4 = \frac{1}{n}\sum_{i=1}^{n} \widehat{U}_i\widehat{U}_i^{\top}$. A simple calculation yields
$$\tilde{\gamma} - \gamma_0 = \frac{1}{n} V_4^{-1} \sum_{i=1}^{n} \widehat{U}_i\,\{E(\epsilon^{-1} + \epsilon)\}^{-1}(\epsilon_i - \epsilon_i^{-1}) - V_4^{-1} V_3(\tilde{\alpha} - \alpha_0) + o_p(1).$$
Similarly,
$$\tilde{\alpha} - \alpha_0 = \frac{1}{n} V_1^{-1} \sum_{i=1}^{n} Z_i\,\{E(\epsilon^{-1} + \epsilon)\}^{-1}(\epsilon_i - \epsilon_i^{-1}) - V_1^{-1} V_2(\tilde{\gamma} - \gamma_0) + o_p(1).$$
Furthermore, we have
$$\tilde{\alpha} - \alpha_0 = \frac{1}{n}\big[ V_1 - V_2 V_4^{-1} V_3 \big]^{-1} \sum_{i=1}^{n} \{E(\epsilon^{-1} + \epsilon)\}^{-1}(\epsilon_i - \epsilon_i^{-1})\big( Z_i - V_2 V_4^{-1}\widehat{U}_i \big) + o_p(1).$$
Let $\widetilde{Z}_i = Z_i - V_2 V_4^{-1}\widehat{U}_i$. Then,
$$V_1 - V_2 V_4^{-1} V_3 \xrightarrow{p} \Sigma,$$
$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n} \{E(\epsilon^{-1} + \epsilon)\}^{-1}(\epsilon_i - \epsilon_i^{-1})\,\widetilde{Z}_i \xrightarrow{D} N(0, \Sigma_1^{-1}\Sigma\,\Sigma_2),$$
where $\Sigma_1 = \{E(\epsilon + \epsilon^{-1})\}^2$ and $\Sigma_2 = E[(\epsilon - \epsilon^{-1})^2]$.
According to the law of large numbers and the central limit theorem, as $n \to \infty$, we obtain
$$\sqrt{n}\,(\tilde{\alpha} - \alpha_0) \xrightarrow{D} N(0, \Sigma_1^{-1}\Sigma^{-1}\Sigma_2). \;\;\square$$

References

  1. Chen, K.; Guo, S.; Lin, Y.; Ying, Z. Least absolute relative error estimation. J. Am. Stat. Assoc. 2010, 105, 1104–1112. [Google Scholar] [CrossRef] [PubMed]
  2. Fan, R.; Zhang, S.; Wu, Y. Penalized relative error estimation of functional multiplicative regression models with locally sparse properties. J. Korean Stat. Soc. 2022, 51, 666–691. [Google Scholar] [CrossRef]
  3. Shin, H. Partial functional linear regression. J. Stat. Plan. Inference 2009, 139, 3405–3418. [Google Scholar] [CrossRef]
  4. Liu, C.; Su, W.; Su, W. Efficient estimation for functional accelerated failure time model. arXiv 2024, arXiv:2402.05395. [Google Scholar]
  5. Chen, K.; Guo, S.; Lin, Y.; Ying, Z. Least product relative error estimation. J. Multivar. Anal. 2016, 144, 91–98. [Google Scholar] [CrossRef]
  6. Ming, H.; Liu, H.; Yang, H. Least product relative error estimation for identification in multiplicative additive models. J. Comput. Appl. Math. 2022, 404, 113886. [Google Scholar] [CrossRef]
  7. Ye, F.; Zhou, H.; Yang, Y. Asymptotic properties of relative error estimation for accelerated failure time model with divergent number of parameters. Stat. Its Interface 2024, 17, 107–125. [Google Scholar] [CrossRef]
  8. Zhang, J.; Feng, Z.; Peng, H. Estimation and hypothesis test for partial linear multiplicative models. Comput. Stat. Data Anal. 2018, 128, 87–103. [Google Scholar] [CrossRef]
  9. Zhang, T.; Zhang, Q.; Li, N. Least absolute relative error estimation for functional quadratic multiplicative model. Commun. Stat.-Theory Methods 2016, 45, 5802–5817. [Google Scholar] [CrossRef]
  10. Zhang, T.; Huang, Y.; Zhang, Q.; Ma, S.; Ahmed, S. Penalized relative error estimation of a partially functional linear multiplicative model. In Matrices, Statistics and Big Data: Selected Contributions from IWMS 2016; Springer: Cham, Switzerland, 2019; Volume 45, pp. 127–144. [Google Scholar]
  11. Cai, T.; Hall, P. Prediction in functional linear regression. Ann. Stat. 2006, 34, 2159–2179. [Google Scholar] [CrossRef]
  12. Hall, P.; Horowitz, J.L. Methodology and convergence rates for functional linear regression. Ann. Stat. 2007, 35, 70–91. [Google Scholar] [CrossRef]
  13. Yu, P.; Song, X.; Du, J. Composite expectile estimation in partial functional linear regression model. J. Multivar. Anal. 2024, 203, 105343. [Google Scholar] [CrossRef]
  14. Xia, X.; Liu, Z.; Yang, H. Regularized estimation for the least absolute relative error models with a diverging number of covariates. Comput. Stat. Data Anal. 2016, 96, 104–119. [Google Scholar] [CrossRef]
  15. Tang, Q.; Cheng, L. Partial functional linear quantile regression. Sci. China Math. 2019, 57, 2589–2608. [Google Scholar] [CrossRef]
  16. Febrero-Bande, M.; Fuente, M. Statistical Computing in Functional Data Analysis: The R Package fda.usc. J. Stat. Softw. 2012, 51, 1–28. [Google Scholar] [CrossRef]
  17. Knight, K. Limiting distributions for L1 regression estimators under general conditions. Ann. Stat. 1998, 26, 755–770. [Google Scholar] [CrossRef]
  18. Pollard, D. Asymptotics for Least Absolute Deviation Regression Estimators. Econom. Theory 1991, 7, 186–199. [Google Scholar] [CrossRef]
Figure 1. The estimated functional weight β(t) in model (7) with the LPRE method.
Table 1. The biases and standard deviations (in parentheses) of the estimators of α₁ and α₂, and the mean squared error (MSE) of the estimator of α under different error distributions (results × 100).

Error  n    Quantity        LS               LAD               LARE              LPRE
(i)    150  α₁ Bias (Sd)    0.033 (1.268)    0.025 (1.607)     −0.041 (1.575)    0.032 (1.267)
            α₂ Bias (Sd)    0.086 (1.648)    −0.051 (3.064)    −0.019 (1.995)    0.086 (1.649)
            α  MSE          0.043            0.120             0.065             0.043
       300  α₁ Bias (Sd)    −0.002 (0.731)   −0.010 (0.919)    −0.028 (0.893)    −0.001 (0.731)
            α₂ Bias (Sd)    −0.031 (1.014)   −0.188 (1.954)    −0.047 (1.294)    −0.031 (1.015)
            α  MSE          0.016            0.047             0.025             0.016
       600  α₁ Bias (Sd)    −0.016 (0.442)   −0.003 (0.578)    0.005 (0.575)     −0.016 (0.442)
            α₂ Bias (Sd)    −0.029 (0.666)   −0.047 (1.132)    −0.031 (0.799)    −0.029 (0.666)
            α  MSE          0.006            0.016             0.010             0.006
(ii)   150  α₁ Bias (Sd)    −0.047 (9.856)   −0.354 (15.946)   −0.070 (9.078)    −0.018 (8.179)
            α₂ Bias (Sd)    −0.026 (12.612)  −1.135 (30.386)   −0.078 (11.615)   −0.021 (10.537)
            α  MSE          2.562            11.790            2.173             1.779
       300  α₁ Bias (Sd)    −0.421 (6.757)   −0.614 (11.549)   −0.421 (6.255)    −0.289 (5.548)
            α₂ Bias (Sd)    0.119 (10.009)   −0.572 (23.006)   0.086 (9.076)     0.016 (8.245)
            α  MSE          1.460            6.634             1.217             0.988
       600  α₁ Bias (Sd)    0.085 (4.684)    0.097 (8.125)     0.079 (4.311)     0.049 (3.874)
            α₂ Bias (Sd)    0.354 (6.517)    0.447 (15.891)    0.325 (5.973)     0.297 (5.382)
            α  MSE          0.645            3.188             0.544             0.441
(iii)  150  α₁ Bias (Sd)    −0.014 (2.888)   0.030 (4.275)     −0.027 (4.109)    −0.011 (2.846)
            α₂ Bias (Sd)    0.247 (3.767)    0.085 (8.040)     3.503 (5.390)     0.039 (3.719)
            α  MSE          0.226            0.829             0.582             0.219
       300  α₁ Bias (Sd)    −0.130 (1.956)   −0.235 (3.032)    −0.206 (2.949)    −0.126 (1.926)
            α₂ Bias (Sd)    0.202 (2.921)    −0.014 (6.183)    4.028 (4.058)     −0.017 (2.861)
            α  MSE          0.123            0.474             0.414             0.119
       600  α₁ Bias (Sd)    0.041 (1.347)    0.032 (2.133)     0.051 (2.052)     0.041 (1.327)
            α₂ Bias (Sd)    0.336 (1.851)    0.185 (4.112)     4.186 (2.674)     0.114 (1.822)
            α  MSE          0.054            0.215             0.289             0.051
Table 2. The RASEs of the estimators of β(t) under different error distributions (results × 100).

n    Method  Error (i)  Error (ii)  Error (iii)
150  LS      30.752     34.818      31.769
     LAD     30.784     39.462      32.136
     LARE    30.781     34.429      32.098
     LPRE    30.752     33.920      31.760
300  LS      22.087     24.407      22.021
     LAD     22.104     28.314      22.348
     LARE    22.103     24.077      22.297
     LPRE    22.087     23.637      22.013
600  LS      15.245     16.884      15.363
     LAD     15.255     19.446      15.567
     LARE    15.254     16.643      15.556
     LPRE    15.245     16.391      15.358
Table 3. MQEP for different numbers of random partitions.

Method  N = 100  N = 200  N = 400
LS      0.948    3.110    2.815
LAD     2.654    3.262    3.119
LARE    1.160    2.949    3.248
LPRE    0.898    2.696    1.995
PFLM    2.540    2.820    2.907

