Peer-Review Record

Scalar Variance and Scalar Correlation for Functional Data

Mathematics 2023, 11(6), 1317; https://doi.org/10.3390/math11061317
by Cristhian Leonardo Urbano-Leon 1,*,†, Manuel Escabias 1,†, Diana Paola Ovalle-Muñoz 1,† and Javier Olaya-Ochoa 2,†
Reviewer 1:
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 3 February 2023 / Revised: 2 March 2023 / Accepted: 4 March 2023 / Published: 9 March 2023

Round 1

Reviewer 1 Report

In this study, Urbano-Leon et al. propose a new strategy to obtain scalar, rather than functional, estimates of summary statistics for functional data. Through simulation and application studies, the authors demonstrate that their approach properly captures and characterizes properties of functional summary statistics such as variance, covariance, and correlation.

The paper is nicely drafted with a very informative introduction section and the proposed approach is relevant given the increasing application of functional data analysis (FDA) in various scientific domains. I have a few questions and comments.

Comments:

1) Does this new approach work only on an orthonormal basis? Can the authors please clarify this?

2) The authors should provide the full code for their simulation and application studies so that their work and results are reproducible. For example, if all the analysis was performed in R, then please provide all code needed to reproduce the simulation and application study results, tables, and figures. The supplied code should also contain a readme file that clearly describes the relevant functions or sections of the code to aid someone in running them.

3) Can the authors briefly clarify or show (in a supplement) how their choice of dispersion values (e.g., 2.1 as high, 2.2 as moderate, and 2.3 as low) was determined?

4) Since small sample sizes are becoming a common problem, can the authors discuss or clarify the reliability of their approach under small sample sizes? This is based on observations from Table 1, where the authors report very high mean absolute differences under small samples. Can the authors also provide additional metrics such as bias, standard error, and MSE of their estimates as a table?

5) Can the authors discuss how this new approach can aid in the functional regression setup?

Author Response

Your comments and suggestions are greatly appreciated. The following is a detailed response to each comment you made:

 

Comment 1: Does this new approach work only on an orthonormal basis? Can the authors please clarify this?

Response to comment 1: The most notable advantage of our proposal is found when an orthonormal basis is used. However, it can also be used when the basis is not orthonormal, or even when the generating system is not a basis. These results can be derived from equation 9, for which it is necessary to know the norm of each function and the pairwise inner products between all the functions that generate the finite-dimensional subspace.
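
The non-orthonormal case described in this response can be illustrated with a minimal Python sketch. It assumes, as a simplification not taken from the article, that the scalar variance is the average squared L2 distance of each curve to the mean curve; in terms of the coefficient covariance matrix S and the Gram matrix G of pairwise inner products, that quantity equals trace(SG). The function names, the monomial basis, and the trapezoidal quadrature are illustrative choices, and equation 9 of the article may take a different form.

```python
import numpy as np

def trapezoid(values, grid):
    """Trapezoidal-rule integral of sampled values over grid."""
    return float(np.sum((values[1:] + values[:-1]) * np.diff(grid)) / 2.0)

def gram_matrix(basis_values, grid):
    """G[j, k] approximates the L2 inner product <phi_j, phi_k>.

    basis_values has shape (K, T): row k holds phi_k evaluated on grid.
    """
    K = basis_values.shape[0]
    G = np.empty((K, K))
    for j in range(K):
        for k in range(K):
            G[j, k] = trapezoid(basis_values[j] * basis_values[k], grid)
    return G

def scalar_variance(coefs, G):
    """Assumed definition: trace(S G), with S the sample covariance
    matrix of the basis coefficients (rows of coefs are curves)."""
    centered = coefs - coefs.mean(axis=0)
    S = centered.T @ centered / (coefs.shape[0] - 1)
    return float(np.trace(S @ G))

# Hypothetical data: 30 curves in the non-orthonormal basis {1, t, t^2}.
grid = np.linspace(0.0, 1.0, 201)
basis_values = np.vstack([np.ones_like(grid), grid, grid**2])
rng = np.random.default_rng(0)
coefs = rng.normal(size=(30, 3))

G = gram_matrix(basis_values, grid)
var_gram = scalar_variance(coefs, G)

# Direct check: integrate the squared deviation of each curve from the mean.
curves = coefs @ basis_values
centered_curves = curves - curves.mean(axis=0)
sq_norms = np.array([trapezoid(c**2, grid) for c in centered_curves])
var_direct = float(sq_norms.sum() / (len(curves) - 1))
```

Under this assumption, an orthonormal basis gives G equal to the identity, so the computation reduces to the sum of the coefficient variances, which is consistent with the advantage the authors attribute to orthonormal bases.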

 

Comment 2: The authors should provide the full code for their simulation and application studies so that their work and results are reproducible. For example, if all the analysis was performed in R, then please provide all code needed to reproduce the simulation and application study results, tables, and figures. The supplied code should also contain a readme file that clearly describes the relevant functions or sections of the code to aid someone in running them.

Response to comment 2: A supplement with the simulation code has been added.

 

Comment 3: Can the authors briefly clarify or show (in a supplement) how their choice of dispersion values (e.g., 2.1 as high, 2.2 as moderate, and 2.3 as low) was determined?

Response to comment 3: The concept of High, Moderate, and Low is determined based on the range of variability for each coefficient, in the absence of a specific frame of reference. A supplement with the simulation code has been added, which may help clarify this fact.

 

Comment 4: Since small sample sizes are becoming a common problem, can the authors discuss or clarify the reliability of their approach under small sample sizes? This is based on observations from Table 1, where the authors report very high mean absolute differences under small samples. Can the authors also provide additional metrics such as bias, standard error, and MSE of their estimates as a table?

Response to comment 4: Determining the reliability of the estimates remains a matter of investigation for us, which we hope to clarify soon. What we have observed so far is that, as with scalar data, our approach's estimate of the variance in a small sample may be less reliable because it is more susceptible to random fluctuations. Adding further metrics would, unfortunately, take longer than the time we were given to reply to the reviewers.
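
The metrics requested by the reviewer (bias, standard error, and MSE) could be obtained with a Monte Carlo study along the following lines. This is a minimal Python sketch and not the authors' simulation: it assumes an orthonormal basis, independent normal coefficients with hypothetical standard deviations `sigmas`, and a scalar variance equal to the sum of the coefficient variances. The function name `simulate_metrics` and all parameter values are illustrative.

```python
import numpy as np

def simulate_metrics(n, sigmas, n_rep=2000, seed=0):
    """Monte Carlo bias, standard error, and MSE of the estimated
    scalar variance for samples of n curves (assumed setup)."""
    sigmas = np.asarray(sigmas, dtype=float)
    rng = np.random.default_rng(seed)
    # True value under the assumed definition: sum of coefficient variances.
    true_value = float(np.sum(sigmas**2))
    estimates = np.empty(n_rep)
    for r in range(n_rep):
        # Each row is the coefficient vector of one simulated curve.
        coefs = rng.normal(0.0, sigmas, size=(n, sigmas.size))
        estimates[r] = coefs.var(axis=0, ddof=1).sum()
    bias = float(estimates.mean() - true_value)
    std_error = float(estimates.std(ddof=1))
    mse = float(np.mean((estimates - true_value) ** 2))
    return {"bias": bias, "se": std_error, "mse": mse}

# Hypothetical comparison of a small and a larger sample size.
small = simulate_metrics(n=5, sigmas=[1.0, 0.5, 0.25])
large = simulate_metrics(n=100, sigmas=[1.0, 0.5, 0.25])
for label, result in (("n=5", small), ("n=100", large)):
    print(label, result)
```

In a setup like this one, the standard error is expected to be much larger at n = 5 than at n = 100, which matches the authors' remark that small-sample estimates are more susceptible to random fluctuations.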

 

Comment 5: Can the authors discuss how this new approach can aid in the functional regression setup?

Response to comment 5: Our approach could help in the functional regression setup because the proposed correlation makes it possible to evaluate whether two functional covariates are highly correlated.
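
As a rough illustration of such a check, the following Python sketch computes a single correlation value between two functional covariates from their basis coefficients. It assumes, without confirmation from the article, that the scalar covariance is the average L2 inner product of the centered curves and that the scalar correlation divides it by the square root of the product of the two scalar variances. The function names and the simulated coefficients are hypothetical.

```python
import numpy as np

def scalar_covariance(a, b, G=None):
    """Average inner product of centered curves, computed from
    coefficient matrices a and b (rows are curves). G is the Gram
    matrix of the basis; the identity is used if the basis is
    assumed orthonormal."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if G is None:
        G = np.eye(a.shape[1])
    a_centered = a - a.mean(axis=0)
    b_centered = b - b.mean(axis=0)
    S_ab = a_centered.T @ b_centered / (a.shape[0] - 1)
    return float(np.trace(S_ab @ G))

def scalar_correlation(a, b, G=None):
    """Assumed scalar correlation: covariance over the geometric
    mean of the two scalar variances."""
    cov_ab = scalar_covariance(a, b, G)
    var_a = scalar_covariance(a, a, G)
    var_b = scalar_covariance(b, b, G)
    return cov_ab / np.sqrt(var_a * var_b)

# Hypothetical covariates: b is a noisy multiple of a.
rng = np.random.default_rng(1)
a = rng.normal(size=(50, 4))
b = 0.8 * a + 0.3 * rng.normal(size=(50, 4))
rho = scalar_correlation(a, b)
print(f"scalar correlation: {rho:.3f}")
```

A value of `rho` close to 1 or -1 would flag the two covariates as nearly redundant before they are entered together in a functional regression model.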

Reviewer 2 Report

The paper contributes a redefinition of some common summary statistics in FDA. The new definitions are reasonable and useful, especially for quantifying the correlation between random functions. I believe that these new statistics play an important role in the development of FDA. The theoretical derivation and numerical results are trustworthy.

Some questions are listed below:

(1) Existing research has been able to deal with some very complex functional data, such as sparse, multidimensional, and manifold data, among others. Can the statistics in this paper be applied directly to such complex functional data?

(2) The authors have summarized the advantages of the approach. I suggest that they also list some of its limitations, for example, the loss of data information.

Author Response

Your questions are greatly appreciated. The following is a detailed response to each question you raised.

 

Question 1: Existing research has been able to deal with some very complex functional data, such as sparse, multidimensional, and manifold data, among others. Can the statistics in this paper be applied directly to such complex functional data?

Response to question 1: If the complex functional data are represented in a basis of a finite-dimensional subspace of the L2 space, the proposed statistics can be applied directly, with a greater operational advantage if the chosen basis is orthonormal.

 

Question 2: The authors have summarized the advantages of the approach. I suggest that they also list some of its limitations, for example, the loss of data information.

Response to question 2: We have added a paragraph in the conclusion section dedicated to possible limitations of our approach.

Reviewer 3 Report

Review attached

Comments for author File: Comments.pdf

Author Response

We appreciate your suggestions and comments. Below you will find a detailed list of the actions taken and the responses to your comments.

 

Comment 1: Bibliographic references should not be inserted in the abstract.

Response to comment 1: This suggestion has been addressed; the references have been removed from the abstract.

 

Comment 2: In the proof of theorem 1, the statement "we also have that B has an orthonormal basis" should be replaced with "B is an orthonormal basis".

Response to comment 2: Your suggestion has been implemented.

 

Comment 3: In section 3, a spelling correction should be made: “which builds a variety of functional data sets and we represent, it in the …” should read “which builds a variety of functional data sets, and we represent it in the …”.

Response to comment 3: The correction has been made.

 

Comment 4: In section 4.2, in the paragraph preceding Figure 6, "X-axis has been sent to [0, 1]" should be replaced by "X-axis has been set to [0, 1]".

Response to comment 4: The correction has been made.

 

Comment 5: Web links shorturl.at/BDRSY and shorturl.at/acoy6 in the Declarations section do not work.

Response to comment 5: This observation has been addressed; the web links have been updated.

 

Comment 6: More details should be specified about the construction of the functions used in the simulations in section 3, for reproducibility and for a better exemplification of the representation vectors.

Response to comment 6: To clarify the simulation process, a supplement containing the code used has been added.

 

Comment 7: The charts made using functional descriptive analysis from [1] allow various analyses of the time series, while the proposed method quantifies the result in a numerical calculation, by which the values of some scalar statistical measures are obtained. It should be explained why the charts in figure 5 were used in the functional cross-correlation approach from [1], if no correlation could be highlighted based on them between the functional data from the considered areas.

Response to comment 7: The charts in figure 5 were used to illustrate the difference between the existing approach and our approach, precisely because in the approach of [1] it is not possible to determine any general correlation between pairs of functional variables, while in our approach it is possible, as illustrated in Table 2.

 

Comment 8: Considerations regarding the limitations of the proposed method should be added to the article.

Response to comment 8: We have added a paragraph in the conclusion section dedicated to possible limitations of our approach.
