Extended Abstract

Sparse Semi-Functional Partial Linear Single-Index Regression †

1 MODES Research Group, CITIC, Universidade da Coruña, 15071 A Coruña, Spain
2 Institut de Mathématiques, Université Paul Sabatier, 31062 Toulouse, France
* Author to whom correspondence should be addressed.
† Presented at the XoveTIC Congress, A Coruña, Spain, 27–28 September 2018.
Proceedings 2018, 2(18), 1190; https://doi.org/10.3390/proceedings2181190
Published: 17 September 2018
(This article belongs to the Proceedings of XoveTIC Congress 2018)

Abstract

The variable selection problem is studied in the sparse semi-functional partial linear model in which the functional covariate influences the response through a single index. A penalized least-squares procedure is employed for this task. Some properties of the resulting estimators are derived: the existence (and rate of convergence) of a consistent estimator of the parameters in the linear part, and an oracle property for the variable selection method. Finally, a real data application illustrates the good performance of the procedure.

1. Introduction

In many real problems, observations of many other variables are available for predicting the value of a random variable. However, it is often unknown which of them (typically very few) have a real influence on the response. In this practical framework, we need procedures able to select the relevant variables in order to avoid high-dimensionality problems. Reducing the complexity of the model becomes even more crucial when the regression also involves a functional variable (data that are functions, images, ...). Therefore, the main goal is to simplify the model, which makes both its estimation and its interpretation easier, without losing predictive efficiency.
These practical problems have motivated the rise of semiparametric models in functional regression, together with variable selection procedures. In [1], the penalized least-squares method for estimation and variable selection is studied for the partial linear model with a functional covariate. In this model, the real covariates have a linear effect on the response (involving interpretable coefficients, which are the parameters), while the infinite-dimensional covariate has a nonlinear (nonparametric) influence. However, in real data applications it would be useful to have parameters related to the functional variable as well, in order to derive practical interpretations. This is one of the advantages of the semi-functional partial linear single-index model (SFPLSIM): the real covariates still affect the response linearly, but the infinite-dimensional covariate influences it through a projection onto an unknown direction, followed by a nonlinear link function. This direction of projection behaves like a functional parameter that can have interesting interpretations. Some theoretical properties related to the nonparametric estimation of the functional single-index model are given in [2]. In this paper, we study the sparse SFPLSIM, focusing on the variable selection problem. For this purpose, we use the penalized least-squares procedure to estimate the parameters of the linear component and, simultaneously, to select the relevant covariates. The properties of the estimators are analysed from a theoretical point of view: we establish their convergence rates and the consistency of the model selection. These results are illustrated through a real data application.

2. The Model

The SFPLSIM is defined by the relationship
$$Y_i = X_{i1}\beta_{01} + \cdots + X_{ip_n}\beta_{0p_n} + m\left(\langle \theta_0, \mathcal{X}_i \rangle\right) + \varepsilon_i, \quad i = 1, \ldots, n, \tag{1}$$
where $Y_i$ denotes a scalar response, $X_{i1}, \ldots, X_{ip_n}$ are random covariates taking values in $\mathbb{R}$ and $\mathcal{X}_i$ is a functional random covariate valued in a separable Hilbert space $\mathcal{H}$ with inner product $\langle \cdot, \cdot \rangle$. $\beta_0 = (\beta_{01}, \ldots, \beta_{0p_n})^{\top} \in \mathbb{R}^{p_n}$, $\theta_0 \in \mathcal{H}$ and $m(\cdot)$ are a vector of unknown real parameters, an unknown functional direction and an unknown smooth real-valued function, respectively. Finally, $\varepsilon_i$ is the random error, which verifies $E\left(\varepsilon_i \mid X_{i1}, \ldots, X_{ip_n}, \mathcal{X}_i\right) = 0$.
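As an illustration of the structure of the model, the following minimal sketch simulates data from a toy SFPLSIM. Everything concrete in it is an assumption made for the example and does not come from the paper: the curves are observed on a regular grid of [0, 1], the inner product is approximated by a Riemann sum, and the link `m`, the direction `theta0`, the sparse vector `beta0` and the function name `simulate_sfplsim` are all hypothetical choices.

```python
import numpy as np

def simulate_sfplsim(n=200, p=10, n_grid=100, noise_sd=0.1, seed=0):
    """Draw (Y, X, curves) from a toy SFPLSIM; every concrete choice is illustrative."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_grid)      # grid on which the curves are observed
    dt = t[1] - t[0]

    # Sparse linear part: only the first three coefficients are non-zero (needs p >= 3).
    beta0 = np.zeros(p)
    beta0[:3] = [3.0, 1.5, 2.0]
    X = rng.normal(size=(n, p))

    # Functional covariate: random combinations of two smooth basis curves.
    a = rng.uniform(-1.0, 1.0, size=(n, 2))
    curves = a[:, [0]] * np.cos(2 * np.pi * t) + a[:, [1]] * np.sin(2 * np.pi * t)

    # Hypothetical functional direction (unit L2 norm on [0, 1]) and link function.
    theta0 = np.sqrt(2.0) * np.sin(2 * np.pi * t)
    index = (curves @ theta0) * dt         # Riemann approximation of <theta0, X_i>
    m = lambda u: u ** 3

    eps = rng.normal(scale=noise_sd, size=n)
    Y = X @ beta0 + m(index) + eps         # linear part + single-index part + error
    return Y, X, curves, t, beta0, theta0
```

Calling `simulate_sfplsim()` returns the response vector, the matrix of scalar covariates and the discretized curves, together with the true `beta0` and `theta0` used to generate them.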

3. The Penalized Least-Squares Estimators

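In order to simultaneously estimate the $\beta$-parameters and select the relevant $X$-covariates in the SFPLSIM (1), we apply the penalized least-squares approach. In a first step, we transform the SFPLSIM into a linear model by extracting from $Y_i$ and $X_{ij}$ ($j = 1, \ldots, p_n$) the effect of the functional covariate $\mathcal{X}_i$ when it is projected on the direction $\theta_0$. Specifically, denoting $\mathbf{X}_i = (X_{i1}, X_{i2}, \ldots, X_{ip_n})^{\top}$, $\mathbf{X} = (\mathbf{X}_1, \ldots, \mathbf{X}_n)^{\top}$ and $\mathbf{Y} = (Y_1, \ldots, Y_n)^{\top}$, the fact that
$$Y_i - E\left(Y_i \mid \langle \theta_0, \mathcal{X}_i \rangle\right) = \left(\mathbf{X}_i - E\left(\mathbf{X}_i \mid \langle \theta_0, \mathcal{X}_i \rangle\right)\right)^{\top} \beta_0 + \varepsilon_i, \quad i = 1, \ldots, n, \tag{2}$$
allows us to consider the following approximate linear model (see Appendix A for the notation):
$$\widetilde{\mathbf{Y}}_{\theta_0} \approx \widetilde{\mathbf{X}}_{\theta_0} \beta_0 + \boldsymbol{\varepsilon}, \tag{3}$$
where $\boldsymbol{\varepsilon} = (\varepsilon_1, \ldots, \varepsilon_n)^{\top}$. Then, in a second step, the penalized least-squares approach is applied to model (3). Specifically, $\beta_0$ and $\theta_0$ are estimated by a minimizer, $(\widehat{\beta}_0, \widehat{\theta}_0)$, of the penalized profile least-squares function
$$Q(\beta, \theta) = \frac{1}{2} \left(\widetilde{\mathbf{Y}}_{\theta} - \widetilde{\mathbf{X}}_{\theta} \beta\right)^{\top} \left(\widetilde{\mathbf{Y}}_{\theta} - \widetilde{\mathbf{X}}_{\theta} \beta\right) + n \sum_{j=1}^{p_n} \mathcal{P}_{\lambda_{j_n}}\left(|\beta_j|\right), \tag{4}$$
where $\beta = (\beta_1, \ldots, \beta_{p_n})^{\top}$, $\mathcal{P}_{\lambda_{j_n}}(\cdot)$ is a penalty function and $\lambda_{j_n} > 0$ is a tuning parameter. Note that, in addition to estimating the parameters, this procedure acts as a variable selection method: if $\widehat{\beta}_{0j}$ is a non-null component of $\widehat{\beta}_0$, then $X_j$ is selected as an influential variable.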
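The penalized profile least-squares function can be evaluated numerically once a penalty is chosen. The sketch below is only an illustration under stated assumptions: the text does not name a penalty, so the SCAD penalty with the conventional constant `a = 3.7` is swapped in, and the function names `scad_penalty` and `penalized_objective` are hypothetical. The inputs `Y_tilde` and `X_tilde` stand for the transformed response and design matrix for a fixed direction.

```python
import numpy as np

def scad_penalty(t, lam, a=3.7):
    """SCAD penalty evaluated at |t| (an assumed choice of penalty, not taken from the text)."""
    t = np.abs(np.asarray(t, dtype=float))
    linear = lam * t                                              # region |t| <= lam
    quad = (2 * a * lam * t - t ** 2 - lam ** 2) / (2 * (a - 1))  # region lam < |t| <= a*lam
    const = lam ** 2 * (a + 1) / 2                                # region |t| > a*lam
    return np.where(t <= lam, linear, np.where(t <= a * lam, quad, const))

def penalized_objective(beta, Y_tilde, X_tilde, lam):
    """0.5 * ||Y_tilde - X_tilde beta||^2 + n * sum_j P_lam(|beta_j|).

    `lam` may be a scalar or an array with one tuning parameter per coefficient.
    """
    n = Y_tilde.shape[0]
    resid = Y_tilde - X_tilde @ beta
    return 0.5 * resid @ resid + n * np.sum(scad_penalty(beta, lam))
```

In a full procedure this objective would be minimized jointly over the coefficient vector and the direction; the sketch only evaluates it, leaving the choice of optimizer open.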
From now on, we will denote $J_n = \{1, \ldots, p_n\}$ and $S_n \subset J_n$ such that $\beta_{0j} \neq 0$ for $j \in S_n$ and $\beta_{0j} = 0$ for $j \in S_n^c = J_n \setminus S_n$. In addition, $s_n$ will mean $\operatorname{card}(S_n)$ and we will assume that $S_n = \{1, \ldots, s_n\}$.

4. Asymptotic Theory

In this paper, the existence of the penalized estimator is established as well as the corresponding rates of convergence. In particular, under some assumptions, we proved that there exists a local minimizer $(\widehat{\beta}_0, \widehat{\theta}_0)$ of $Q(\beta, \theta)$ such that
$$\left\| \widehat{\beta}_0 - \beta_0 \right\| = O_p\left(\sqrt{s_n}\left(n^{-1/2} + \delta_n\right)\right), \quad \text{where } \delta_n = \max_{j \in S_n} \left| \mathcal{P}'_{\lambda_{j_n}}\left(|\beta_{0j}|\right) \right|.$$
Furthermore, the selected set of variables, $\widehat{S}_n = \{ j \in J_n;\ \widehat{\beta}_{0j} \neq 0 \}$, works as well (at least asymptotically) as it would do if the true set of relevant variables $S_n$ was known. Specifically, $P(\widehat{S}_n = S_n) \to 1$ as $n \to \infty$.
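To make the role of $\delta_n$ in the convergence rate concrete, the following worked case assumes the SCAD penalty with constant $a > 2$ and a common tuning parameter $\lambda_n$ for all coefficients. Neither choice is stated in the text; they are used here only as an example of a penalty whose derivative vanishes away from zero.

```latex
% Derivative of the SCAD penalty (assumed penalty), for t > 0:
\mathcal{P}'_{\lambda}(t)
  = \lambda \left\{ \mathbf{1}_{\{t \le \lambda\}}
  + \frac{(a\lambda - t)_{+}}{(a-1)\lambda}\, \mathbf{1}_{\{t > \lambda\}} \right\},
\qquad\text{so}\qquad
\mathcal{P}'_{\lambda}(t) = 0 \ \text{ for } t \ge a\lambda .

% Hence, if the smallest non-null coefficient satisfies
% \min_{j \in S_n} |\beta_{0j}| \ge a\lambda_n, then
\delta_n = \max_{j \in S_n} \left| \mathcal{P}'_{\lambda_n}\!\left(|\beta_{0j}|\right) \right| = 0
\quad\Longrightarrow\quad
\left\| \widehat{\beta}_0 - \beta_0 \right\|
  = O_p\!\left( \sqrt{s_n}\, n^{-1/2} \right)
  = O_p\!\left( \sqrt{s_n / n} \right).
```

Under these assumptions the penalty does not slow the rate: the estimator converges as fast as a least-squares fit restricted to the $s_n$ relevant covariates would.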
An application to real data is included, which shows the good performance of the presented method in terms of error of prediction.

Funding

The authors acknowledge partial support by MINECO grants MTM2014-52876-R and MTM2017-82724-R (EU ERDF support included). Additionally, financial support from the Xunta de Galicia (Centro Singular de Investigación de Galicia accreditation ED431G/01 2016-2019 and Grupos de Referencia Competitiva ED431C2016-015) and the European Union (European Regional Development Fund - ERDF), is gratefully acknowledged. The first author also thanks the financial support from the Xunta de Galicia and the European Union (European Social Fund - ESF), the reference of which is ED481A-2018/191.

Conflicts of Interest

The authors declare no conflict of interest. The founding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
SFPLSIM: Semi-functional partial linear single-index model

Appendix A. Notation

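For any $(n \times q)$-matrix $A$ ($q \geq 1$), if $I$ is the $(n \times n)$ identity matrix, we denote
$$\widetilde{A}_{\theta} = \left(I - W_{h,\theta}\right) A, \quad \text{where } W_{h,\theta} = \left( w_{n,h,\theta}(\mathcal{X}_i, \mathcal{X}_j) \right)_{i,j},$$
with $w_{n,h,\theta}(\cdot, \cdot)$ being the weight function
$$w_{n,h,\theta}(\chi, \mathcal{X}_i) = \frac{K\left( d_{\theta}(\chi, \mathcal{X}_i) / h \right)}{\sum_{j=1}^{n} K\left( d_{\theta}(\chi, \mathcal{X}_j) / h \right)},$$
where $K : \mathbb{R}^{+} \to \mathbb{R}^{+}$ is a kernel function, $h > 0$ is a smoothing parameter and, for $\theta \in \mathcal{H}$, $d_{\theta}(\cdot, \cdot)$ is the semimetric defined as
$$d_{\theta}(\chi, \chi') = \left| \langle \theta, \chi - \chi' \rangle \right|, \quad \chi, \chi' \in \mathcal{H}.$$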
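The notation above translates directly into a few array operations. The sketch below is a minimal illustration under assumptions that are not in the text: curves are discretized on a regular grid with spacing `dt`, the inner product is approximated by a Riemann sum, and an Epanechnikov-type kernel supported on [0, 1] is used for `K` (its normalizing constant cancels in the weights). The function names are hypothetical.

```python
import numpy as np

def projection_semimetric(curves, theta, dt):
    """Pairwise d_theta(chi_i, chi_j) = |<theta, chi_i - chi_j>| for discretized curves."""
    proj = (curves @ theta) * dt            # Riemann approximation of <theta, chi_i>
    return np.abs(proj[:, None] - proj[None, :])

def weight_matrix(curves, theta, h, dt):
    """W_{h,theta}: entry (i, j) is K(d_ij / h) divided by the sum of row i."""
    u = projection_semimetric(curves, theta, dt) / h
    # Assumed kernel: Epanechnikov-type, positive on [0, 1) and zero beyond.
    K = np.where(u <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)
    # Each row contains its own diagonal term K(0) > 0, so the row sum is never zero.
    return K / K.sum(axis=1, keepdims=True)

def tilde(A, W):
    """A_tilde = (I - W) A, computed without forming the identity matrix."""
    return A - W @ A
```

Because every row of the weight matrix sums to one, applying `tilde` to a constant column returns zeros, which is a convenient sanity check on an implementation.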

References

  1. Aneiros, G.; Ferraty, F.; Vieu, P. Variable selection in partial linear regression with functional covariate. Statistics 2015, 49, 1322–1347.
  2. Novo, S.; Aneiros, G.; Vieu, P. Automatic and location-adaptive estimation in functional single-index regression. 2018; in press.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
