Pointwise Nonparametric Estimation of Odds Ratio Curves with R: Introducing the flexOR Package
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This manuscript introduces flexOR, an R package designed for estimating spline-based odds ratio curves with a specified covariate reference point. This capability enhances understanding of the impact of continuous covariates on outcomes. The package facilitates computation of degrees of freedom, odds ratios, and confidence intervals, as well as prediction and visualization. The manuscript begins with an overview of common methods for modeling nonlinear relationships and their challenges. It then delves into the additive model and provides a detailed description of the software. Two case studies demonstrate practical applications. flexOR is built on smoothing splines and utilizes a novel approach for approximating the covariance matrix of the log odds ratio to construct confidence intervals.
Here are some comments.
1. It would be beneficial to provide more detailed information about the novel method used to approximate the covariance matrix of the log odds ratio.
2. We can actually compute point-wise confidence intervals for the effect of the continuous variable on the outcome using the results obtained from the mgcv package. Subsequently, we can utilize this information to generate odds ratio curves. Have you compared this approach to the results obtained from flexOR? What are the advantages of using flexOR?
3. Can the method be extended to handle bivariate splines? In other words, can it accommodate the smooth effects of two variables analyzed jointly?
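For readers who want to try the mgcv-based route raised in comment 2, here is a minimal sketch on simulated data. The data, the variable names, the reference value of 50, and the use of the covariance matrix returned by vcov() are all assumptions of this sketch; it is not the flexOR implementation and may differ from the covariance approximation used in the package.

```r
library(mgcv)

# Simulated data (assumption): one continuous covariate with a nonlinear effect
set.seed(1)
n <- 500
x <- runif(n, 20, 80)
y <- rbinom(n, 1, plogis(-1 + 0.002 * (x - 50)^2))
dat <- data.frame(x = x, y = y)

fit <- gam(y ~ s(x), family = binomial, data = dat, method = "REML")

x_ref <- 50                                   # hypothetical reference value
grid  <- data.frame(x = seq(25, 75, by = 5))

# Design matrices mapping coefficients to the linear predictor
X_grid <- predict(fit, newdata = grid, type = "lpmatrix")
X_ref  <- predict(fit, newdata = data.frame(x = x_ref), type = "lpmatrix")

# Contrast rows: each grid point minus the reference point
D <- X_grid - matrix(X_ref[1, ], nrow = nrow(X_grid), ncol = ncol(X_grid),
                     byrow = TRUE)

log_or <- as.numeric(D %*% coef(fit))
# Pointwise standard error of each contrast: sqrt(diag(D V D'))
se <- sqrt(pmax(rowSums((D %*% vcov(fit)) * D), 0))

or_curve <- data.frame(
  x     = grid$x,
  or    = exp(log_or),
  lower = exp(log_or - 1.96 * se),
  upper = exp(log_or + 1.96 * se)
)
print(or_curve)
```

Because the contrast is taken against the reference row, the odds ratio at the reference value is exactly 1 with a zero-width interval, which is the behaviour expected of a referenced odds ratio curve.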
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Please see the comments attached.
Comments for author File: Comments.pdf
Comments on the Quality of English Language
The writing is fine. I don't have further comments on the English language.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The manuscript describes the flexOR package, which provides a comprehensive framework for pointwise nonparametric estimation of odds ratio curves for continuous predictors in logistic regression models. The package offers various options for automatically choosing the degrees of freedom in multivariable models and includes visualization functions to aid in the interpretation and presentation of the estimated odds ratio curves. The manuscript provides a general description of the package, but a few points require further clarification.
1. When estimating the parameters for smoothing splines, how does the flexOR package choose among the AIC, BIC, REML, and GCV.Cp criteria? The authors mention that the selection is made by another R package, mgcv or dfgam, and may vary case by case across datasets. The authors should provide more illustrations or details to show why and when each criterion is used.
2. Following on from the question above, the authors might consider making a comprehensive comparison of the different criteria when fitting data. This would help users understand how to apply the package to their own data and would strengthen the rationale and scope of the study.
3. When presenting the examples, the authors should explain the abbreviations used for the options in the flexOR package. For example, does "s" mean smoothing or something else, and what other options are available for a "formula"?
4. Will more visualization options (e.g., plot types) be included in the package to help interpret the smoothing results?
5. In the examples provided, how did the authors assess the performance of the smoothing results? Is it better than that of other methods or packages?
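The kind of comparison requested in comments 1 and 2 could look roughly like the sketch below, written with mgcv only and simulated data. It contrasts the effective degrees of freedom from penalized fits tuned by REML and GCV.Cp with the degrees of freedom chosen by AIC and BIC over a grid of unpenalized fits. The grid search is an assumption made for illustration and is not claimed to be what dfgam does internally.

```r
library(mgcv)

# Simulated data (assumption): binary outcome with a smooth nonlinear effect
set.seed(2)
n <- 400
x <- runif(n, 0, 1)
y <- rbinom(n, 1, plogis(2 * sin(2 * pi * x)))
dat <- data.frame(x = x, y = y)

# Penalized fits: the smoothing parameter is chosen by REML or by GCV.Cp
edf_by_method <- sapply(c("REML", "GCV.Cp"), function(m) {
  fit <- gam(y ~ s(x), family = binomial, data = dat, method = m)
  sum(fit$edf) - 1            # effective df of the smooth, without intercept
})

# Unpenalized fits (fx = TRUE) over a grid of basis sizes, scored by AIC and BIC
ks <- 3:10
ic <- t(sapply(ks, function(k) {
  fit <- gam(y ~ s(x, k = k, fx = TRUE), family = binomial, data = dat)
  ll  <- logLik(fit)
  p   <- attr(ll, "df")       # number of estimated parameters
  c(df  = k - 1,
    AIC = -2 * as.numeric(ll) + 2 * p,
    BIC = -2 * as.numeric(ll) + log(n) * p)
}))

df_aic <- ic[which.min(ic[, "AIC"]), "df"]
df_bic <- ic[which.min(ic[, "BIC"]), "df"]

print(round(edf_by_method, 2))
print(c(AIC = unname(df_aic), BIC = unname(df_bic)))
```

Since log(n) exceeds 2 here, BIC penalizes complexity more heavily than AIC and therefore never selects more degrees of freedom than AIC on the same grid; the penalized criteria return non-integer effective degrees of freedom instead.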
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
1. In Section 2, ‘Additive Model’, the authors give the mathematical formulas for logistic additive models, in which the log-odds of an event are modeled as a combination of linear terms and transformed covariates, and for the adjusted odds ratio. However, this is a well-known approach, and the authors do not seem to offer anything new mathematically. Furthermore, the two main functions in the ‘flexOR’ package, flexOR and dfgam, are essentially wrappers around the existing gam functions from the ‘mgcv’ and ‘gam’ packages, respectively. The authors do not appear to offer any new solutions but instead focus on modifying existing packages. The purpose of including the mathematical formulas is therefore unclear.
2. In the first R script, only the “flexOR” library is loaded, but the ‘gam’ function is subsequently used. The reader can only conclude that it is the gam function from the ‘mgcv’ package by scrutinizing the function arguments. The authors need to state this explicitly in their paper.
3. In the same script, for demonstration purposes, the authors seem to use a dataset that is not available in any R library. The point of showing a script is to let readers reproduce the steps and see how the results are derived on their own machines. That is impossible with the “heart2” dataset. Alternatively, if the dataset is part of an R package, the authors need to clarify how to access it.
4. In addition to point #3, it is not possible to follow the steps in this script. The input dataset for the gam function is called ‘heart2’, while further down, for flexOR, the authors use ‘heart’. Is this a typo, or is a step not shown?
5. The part with ‘plotly’ just adds an interactive effect to the existing plot. Otherwise, the plots look identical. It has nothing to do with the proposed functional implementations.
6. The second script, although it uses an accessible dataset, is also irreproducible. The degrees of freedom for both age and mass are 1.1, which differs from what the authors claim in the manuscript (3.3, 4.1). Installing the package directly from the GitHub repo does not solve the issue. SessionInfo is included for the authors’ reference.
7. The authors must do a better job of ensuring that the package runs on different setups, such as different operating systems, or else explicitly specify the preferred OS. While there were no visible problems installing it on Windows 10, installation on Ubuntu 18.04 failed because of the gam dependency, and additional steps were required there. Similarly, following #6, although the package manual states R (>= 3.1.0) as a requirement, the attempt to reproduce the example from the manuscript failed in R 4.3.1.
8. The rest is not possible to verify. The authors must double-check the scripts in the manuscript to make sure that they are fully reproducible.
Session info is as follows.
# Install flexOR from GitHub
remotes::install_github(
  repo = "martaaaa/flexOR",
  build = TRUE,
  build_manual = TRUE
)
library(flexOR)
library(mlbench)
data(PimaIndiansDiabetes2)
# Select degrees of freedom for the nonlinear predictors by AIC
df2 <- dfgam(response = "diabetes",
             nl.predictors = c("age", "mass"),
             other.predictors = c("pedigree"),
             smoother = "s",
             method = "AIC",
             data = PimaIndiansDiabetes2)
df2$df
The output:
df
age 1.1
mass 1.1
SessionInfo:
R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
Matrix products: default
locale:
time zone:
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] mlbench_2.1-3.1 flexOR_0.9.6
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The manuscript has been thoroughly revised to address the feedback received, significantly enhancing its quality. The addition of a detailed description of the novel approximation of the covariance matrix of the log odds ratio contributes to a more comprehensive understanding of the methodology. Furthermore, the explicit explanation of the advantages offered by the flexOR package strengthens the manuscript's value proposition. Additionally, the description of potential extensions to incorporate bivariate splines for modeling bivariate intersections adds further depth to the research. Overall, these improvements have resulted in a manuscript that is now more robust and complete.
Author Response
Thank you for your thorough review and positive feedback!
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have addressed all of the issues discussed in the previous round.
Nonetheless, I would like to proffer a few comments for their consideration:
1. \beta_1 in line 185 appears distinct from \beta_1 in Equation (4), suggesting that an alternative symbol may enhance clarity.
2. It might be better to properly describe how the spline method works.
Author Response
- \beta_1 in line 185 appears distinct from \beta_1 in Equation (4), suggesting that an alternative symbol may enhance clarity.
Thank you for your valuable feedback. We appreciate your keen observation regarding the use of symbols in the manuscript. This has been corrected.
- It might be better to properly describe how the spline method works.
Thank you for your suggestion. We have revised the manuscript to include a brief description of the spline method. References have also been included.
Reviewer 4 Report
Comments and Suggestions for Authors
No further comments.
Author Response
Thank you for your thorough review and positive feedback!