Pavement Distress Initiation Prediction by Time-Lag Analysis and Logistic Regression
Round 1
Reviewer 1 Report
Please see the attached Word document.
In general, this is an interesting paper. It uses new image data sources to investigate the time-lag effect of weather on the initiation of distresses over short time periods. However, some important citations for the methods are omitted, and the discussion, the modelling method, the conclusion, and the limitations should be better addressed.
Abstract
Lines 18-25: the content of lines 20-25 seems to duplicate that of lines 18-20. To me, they are similar summaries of the same results. Please revise, and add some discussion of the implications of the results (for policy, transportation planning, or urban planning).
1. Introduction
1.1 Background
Lines 41-44: what is the timeframe of a long-term model, and how frequently are the data for a short-term model collected? Please add some context for clarity and readability.
The same advice applies to Lines 47-49, Line 55, and Line 67. You never clearly define what is long term (annual data?) and what is short term (daily data?). It may be clear to experts in pavement maintenance how long "long term" is and how short "short term" is. However, some of your readers will not have this knowledge.
Line 73-74: add citations if any.
Line 75: How large?
Line 108-113: This is quite an interesting and important statement regarding the research gap. The authors can consider moving it to the beginning of the Introduction section.
2. Data Preparation
Line 152: Why is the study period not a full year? The Jan 21 – Jun 30 period does cover a transition from winter to summer. However, is a transition “from winter to summer” identical to one “from summer to winter” with regard to the condition of the pavement surface? I am afraid the authors did not provide adequate evidence for this.
2.1. Detection of the Pavement Distress Initiation
Line 171: what is the resolution of the image? And why this resolution? Does a larger image provide better/more accurate results?
Line 172-173, and lines 174-176:
Do you use a pre-trained model? How do you describe the performance of the pre-trained model? Are there any alternative models? Why did you choose this one in particular?
Or is it your own model? If so, how accurate is it? What are the criteria to evaluate it? R2/MSRE/MSE, or absolute loss?
Line 176 and Figure 3: Please provide more sample images showing the segmentation results to convince your audience that your pre-trained model is effective. Please also provide information on the accuracy rate.
2.2. Relevant Data Acquisition
Line 203: the title is vague and general. "Independent Variables Acquisition" or another more specific name would be a more appropriate title.
3. Methodology
Line 228: Missing appropriate citation. What studies have used TLCC?
Line 233: Missing appropriate citation.
The Methodology section reads more like a re-description of existing, well-known models. What is missing is why these models and methods are applied and what is novel about using them (do you use any data engineering process that needs to be demonstrated, do you use a large data set, do you modify the model, etc.).
Line 278-282: what is the difference between Equation 5 and Equation 6?
How do you apply your variables to equation 6? What program do you use to fit the model?
4. Results and Discussion
The authors are advised to move Table 1. Summary Statistics of Variables to the first part of the Result section. Usually, this Table is the first kind of analysis result you will have, and it’s more like providing a descriptive summary of the Data section, rather than a result.
As I understand it, Model 2 is the baseline while Model 1 is the best-performing model (with a time lag). Basically, they are the same model, except that Model 1 captures the time-lag effect. The writing in the Abstract and the Introduction is therefore confusing. When you say you use two models, readers might think that you test two types of modelling method, or that the structures of the two models are fundamentally different. In your case, you are simply testing whether the time-lag effect is significant and effective, so it is essentially ONE model. The discussion of the logit regression is not as developed as that of your correlation analysis results. However, it is an important part of your findings, so please present it better.
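A minimal sketch of the point above, assuming synthetic data and hypothetical names (`rain`, `distress`, `lagged_corr`, `best_lag`). It is not the authors' code or data. It scans candidate lags with a plain Pearson correlation, so the lag-0 case plays the role of the baseline and the best nonzero lag plays the role of the time-lag variant of the same model:

```python
import numpy as np


def lagged_corr(x, y, lag):
    """Pearson correlation between x at time t - lag and y at time t."""
    if lag > 0:
        # Drop the last `lag` values of x and the first `lag` values of y
        # so that x[t - lag] is paired with y[t].
        x, y = x[:-lag], y[lag:]
    return float(np.corrcoef(x, y)[0, 1])


def best_lag(x, y, max_lag):
    """Return the lag in 0..max_lag with the largest |correlation|."""
    corrs = [lagged_corr(x, y, k) for k in range(max_lag + 1)]
    return int(np.argmax(np.abs(corrs))), corrs


# Synthetic example (assumption): the response follows the weather
# variable with a delay of 3 time steps, plus noise.
rng = np.random.default_rng(0)
n, true_lag = 400, 3
rain = rng.normal(size=n)
delayed = np.zeros(n)
delayed[true_lag:] = rain[:-true_lag]
distress = delayed + 0.5 * rng.normal(size=n)

lag, corrs = best_lag(rain, distress, max_lag=7)
print("lag-0 correlation:", round(corrs[0], 3))
print("best lag:", lag, "correlation:", round(corrs[lag], 3))
```

In this framing, the predictor shifted by the selected lag would feed the same logistic regression as the unshifted predictor, which is why the two variants amount to one model structure.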
5. Conclusion
Have you identified any areas for improvement in future studies? Are there any limitations in your study data or method? How do you capture the impact of traffic, accidents, or vehicle type on the road? For example, what if a segment simply carries more construction trucks, from which stones and bricks accidentally fall? There are many hidden/latent variables that you have omitted.
Comments for author File: Comments.pdf
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
The presented manuscript mainly studies the prediction of pavement distress initiation by merging pavement distress data with weather and geometric factors. This concept can help improve and regulate pavement maintenance.
Overall, it is an interesting topic and a well-written, well-structured paper.
Meanwhile, I would like the authors to note the following:
1. The abstract is good and well established.
2. The introduction started with good background knowledge, addressed the tackled scientific issue, and surveyed the related works from the literature. The surveyed literature is relevant and adequate.
3. Section 2 demonstrated how, where, and when the data was collected. I found this section good. However, I would strongly recommend the authors to improve the quality of Figure 1.
4. Section 3 describes the proposed methodology, and this is where I found that the presentation lacks a clear and direct explanation of the method used. My advice is to start the section with an introductory paragraph illustrating how the presented method differs from others in analyzing the collected data. In doing so, the authors would also have the opportunity to emphasize their contribution in incorporating the weather and geometric data.
- In the same section, please improve the quality of figure 1. The colors and textures in the four legends are not obvious in the graph.
- Line 233 reads "Windowed Time-lag Cross-correlation (WTLCC)" and line 238 reads "window time-lag cross-correlation (WTLCC)". The abbreviation is defined twice. Keep the first definition only; there is no need to capitalize the first letter of each word.
- I don’t see any citations for the listed equations. An example is equation 3.
- In line 282, P(y) was already defined in line 279.
5. The results section adequately presents the main results and achievements. I would only recommend some proofreading and figure improvements. For instance, the text in Figure 6 is unreadable.
6. The structure of the conclusion is good and sufficient. However, it does not highlight the main results quantitatively (as numbers, percentages, etc.).
Regarding the presented method, in general, my main concerns are:
- The authors collected the data in China and developed their method to predict pavement distress. How applicable will this model be in different areas, for example, drier areas with less rain or higher temperatures?
- If other researchers collect data from such areas, where the environmental conditions differ from those you studied, will this lead to the same conclusions you have already drawn?
- If your model does fit such diverse environments, how would you justify that? If not, what would you do in the future to make your model adaptable to such areas?
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Thank you. The authors have addressed my concerns.
Reviewer 2 Report
The authors have adequately addressed all of my concerns. The manuscript is now much improved.