1. Introduction and Literature Review
Road infrastructures are vital to socioeconomic growth, enabling efficient transportation, stimulating trade, creating jobs, improving accessibility, and attracting investment [1,2,3,4,5]. Furthermore, well-maintained roads are crucial to ensuring the safety, efficiency, and longevity of transport infrastructure [5,6]. Some studies have shown that the inadequate maintenance of pavements results in higher vehicle operating costs, more frequent accidents, substantially increased air pollution, and the reduced reliability of the overall transportation network [3,7,8]. To determine the maintenance required, it is necessary to know the type, severity, and extent of distress in the pavement surface, as well as its structural and roughness condition [9]. Two of the most commonly used indicators to assess the condition of pavements are the international roughness index (IRI) and the pavement condition index (PCI) [10,11].
The IRI is a standardized measurement used to quantify the roughness of a road surface, typically expressed in units of inches per mile or meters per kilometer. Its value is determined by measuring the vertical deviations of the road surface from a straight line over a specific distance [
12]. These measurements are taken using specialized equipment, such as a profilometer, which records variations in elevation as it travels along the road. The IRI reflects the overall quality of a road; a lower IRI indicates a smoother road surface [
13]. This information is valuable for assessing the condition of road infrastructure, prioritizing maintenance activities, and planning road improvement projects to enhance the comfort and safety of road users [
14]. However, the IRI has several drawbacks. Its measurement requires specialized equipment with a high cost of acquisition and use [
15], restricting the frequency at which measurements can feasibly be performed. Additionally, the IRI does not enable the identification of the type of pavement deterioration, which is essential to determine appropriate maintenance treatments [
9].
To overcome the limitations of the IRI, additional visual inspections are required, which form the basis of the PCI, a numerical classification system used to evaluate and quantify the pavement condition based on existing pavement distresses. It provides a standardized and objective measure of the quality and levels of deterioration observed in the pavement, allowing its condition to be assessed at both the structural and operational levels [
16]. This index is calculated via the identification and quantification (of both the severity and extent) of 19 types of pavement deterioration, resulting in a numerical value on a scale from 0 to 100, with a higher value indicating a better pavement condition. ASTM International [
17] establishes levels of pavement condition according to ranges of PCI values. The PCI is a valuable tool for making informed decisions about maintenance, rehabilitation, and resource prioritization in pavement management systems [
18]. It is widely used by transportation agencies, municipalities, and engineering firms to evaluate and manage pavement assets systematically [
19].
In addition, the PCI methodology is simple to implement as it does not require specialized tools. Moreover, in the last decade, researchers have adopted artificial intelligence image processing techniques to evaluate pavement distress and assess PCI, avoiding the disadvantages of visual inspection, such as traffic disruptions and loss of time [20]. The most advanced techniques are based on deep learning [5], such as convolutional neural networks (CNNs) [21,22,23,24], among others. These methodologies allow for the efficient identification, classification, and quantification of pavement distress [21,25,26,27]. Therefore, the application of artificial intelligence techniques to measure the PCI can offer cheaper and more frequent data than is possible with the IRI.
Although the IRI and PCI measure different aspects of pavement conditions, both indices provide valuable information on the quality and performance of road surfaces. The PCI quantifies the deterioration observed in pavements due to distress, whereas the IRI is crucial for managing maintenance operations as it is closely related to key prioritization criteria, such as user comfort, costs, and road safety [
3,
28,
29,
30]. Understanding the relationship between the IRI and PCI would allow for a more complete and holistic assessment of pavement conditions and enable the identification of patterns and trends in their deterioration. This would facilitate informed decision-making, the optimization of resources, and the implementation of effective strategies for road infrastructure maintenance and management [
31]. Thus, it would be useful to develop a method for estimating the IRI from the PCI or observed pavement distresses. This would allow IRI data to be obtained more frequently without incurring the significant costs associated with direct measurements.
The relationship between the IRI and PCI has been studied extensively, yet significant challenges remain. Many prior studies have focused on predicting PCI values from IRI data [
9,
14,
32], some using large datasets such as the Long-Term Pavement Performance (LTPP) database. However, these models frequently exhibit poor predictive accuracy due to high variability in deterioration conditions. For instance, as highlighted by Piryonesi and El-Diraby [
10], roads with a perfect PCI score (PCI = 100) can display markedly different IRI values, influenced by factors such as slope, pavement type, and construction quality. To overcome this limitation, other studies have presented models for specific geographical and climatic contexts [
6,
10]. These studies have achieved high predictive accuracy within specific contexts but lack generalizability across diverse regions. Hence, a review of the literature suggests that, although there is a relationship between the IRI and PCI, it remains challenging to establish an accurate and universal model that works adequately across different climatic and geographical contexts.
Additionally, several studies have explored the influence of pavement distresses on roughness. For instance, Aultman-Hall et al. [
33] used neural networks to examine the correlation between the IRI and two types of pavement distresses, specifically rutting and cracking, using data from the Connecticut Department of Transportation. Kirbaş [
34] performed a regression analysis to study the effect of typical pavement distresses found in Turkish highways, finding that increases in the IRI are most often caused by alligator cracking, depressions, and patch-type deterioration. Amarendra and Ashoke [
35] used multiple linear regression analysis to study the relationship between the IRI and various pavement distresses in an Indian dataset, observing that different types of distresses have distinct effects on roughness. In Indian roads, potholes and raveling were found to be predominant. Although it is well-established in the literature that the type of distress affects pavement roughness, there is no consensus on how pavement distress impacts the IRI across all contexts.
Despite these contributions, existing studies highlight the complexity of accurately modeling the IRI–PCI relationship across varying distress types and external conditions. As pavement distress intensifies, the relationship between the IRI and PCI becomes less predictable, with some forms of distress exerting minimal impact on roughness [
6]. Moreover, while climate and traffic are known to significantly influence pavement deterioration, few studies have comprehensively integrated these factors, and no accurate, universal model has yet been developed to perform consistently across diverse climatic and geographical contexts. This gap underscores the need for comprehensive models that consider multiple distress types, climatic regions, and traffic levels, in order to make more robust and generalizable predictions.
This study aims to address the aforementioned gap by developing a dual-model approach involving three integrated phases: (1) Applying an initial linear regression model to predict the IRI from the PCI, which serves as a baseline for an iterative process that divides the data into groups according to the percentage difference between the predicted and observed values of the initial model; (2) Developing a classification model using multinomial logistic regression to accurately assign each road section to its respective group, considering road conditions such as pavement distress, climate, and traffic; and (3) Predicting the IRI from the PCI using tailored regression models for each classification group. The proposed dual model for predicting the IRI is composed of the logistic regression model (result of step 2) for classification and the linear regression model (result of step 3) for prediction. This comprehensive approach should facilitate more efficient and sustainable pavement maintenance management by providing accessible and affordable data for a wide geographical area covering different traffic and climatic conditions.
2. Research Method
This study’s research methodology is outlined in
Figure 1. First, the data were collected from the LTPP database, which was selected because it includes information for different climatic and traffic contexts [
10,
36]. Then, the data were prepared by analyzing and removing anomalous cases. The dataset was then divided into a 70–30 split for training and validation. Next, the dual model was constructed through an iterative process. First, an initial IRI–PCI model was identified for all data. Second, the difference between the predicted and observed values was used to obtain a classification model to split the data into meaningful groups. Third, a predictive model was generated for each group to enable accurate IRI prediction based on the PCI. This iterative procedure continued until the predictive models achieved the best
R2 value. Finally, a comprehensive validation of both the classification and predictive models was conducted.
2.1. Data Collection and Preparation
The data collection process focused on gathering information about the IRI, pavement distresses, traffic, and climatic conditions from the LTPP database. Several studies have shown that there is a significant relationship between the IRI and PCI [6,9,10,14,32]. Thus, the mean IRI value evaluated along the trajectories of both wheels was directly downloaded. Additionally, pavement distresses have been highlighted as a key element in predicting the IRI, as not all types of pavement distress affect the IRI in the same way [33,34,35]. For this reason, pavement distress data were also downloaded from the database. Some of these types of pavement distresses can be differentiated into three severity levels: low, medium, and high. These variables (pavement distresses by severity) were used to calculate the PCI according to the method recommended by ASTM International [17]. In addition, climate and traffic were considered as they have proven to be of significant importance in IRI prediction [37]. Hence, variables such as the accumulated number of vehicles (AADT_CUM), the accumulated number of heavy vehicles (AADTT_CUM), and the accumulated equivalent single-axle load (ESAL) in thousands (KESAL_CUM) were calculated using the daily traffic information accumulated from the opening date, or from the last maintenance treatment, up to the date of data collection [37,38,39]. Finally, temperature (ANN_TEMP), accumulated precipitation (PRECIP_CUM), and snowfall (SNOWFALL_CUM) have been used in previous studies to predict the IRI [40,41]. These were obtained in the same way as for traffic but using climatic data. Table 1 lists all variables used.
After data collection, the PCI records were correlated with the acquired IRI data to determine their relationship. The following criteria were applied when merging the IRI and PCI data: the difference in time between the collection of the IRI and PCI data must not be greater than one year, and no maintenance work must have taken place during that period. Then, outliers were deleted from the database, such as road sections for which the PCI increased over time. After this process, the data were segmented into two sets: 70% of the data was randomly selected to develop the model, while the remaining 30% was reserved for model validation.
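The merging criteria and the random split described above can be illustrated with a minimal sketch. It is not the procedure actually run on the LTPP data: the function names, the 365-day reading of "one year", and the fixed random seed are assumptions made here for illustration only.

```python
import random
from datetime import date

MAX_GAP_DAYS = 365  # assumed reading of "not greater than one year"


def can_pair(iri_date, pci_date, maintenance_dates):
    """Return True when an IRI survey and a PCI survey may be merged."""
    first, last = sorted((iri_date, pci_date))
    if (last - first).days > MAX_GAP_DAYS:
        return False
    # Reject the pair if any maintenance work falls between the two surveys.
    return not any(first <= m <= last for m in maintenance_dates)


def train_validation_split(records, train_share=0.7, seed=0):
    """Randomly split the records into training and validation subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_share * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    print(can_pair(date(2000, 1, 1), date(2000, 6, 1), []))
    train, validation = train_validation_split(range(2044))
    print(len(train), len(validation))
```

Applied to 2044 records, a 70% share rounds to 1431 training points and leaves 613 for validation.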
2.2. Dual Model Development
Once the sample was segmented, the training data were used to develop the dual model (
Figure 2, where the arrows correspond to the steps followed in the dual model development process). First, a linear regression was applied to predict the IRI from the PCI, as the only independent variable. Linear regression is a statistical method employed to model the relationship between a dependent variable and one or more independent variables. Essentially, it aims to establish the best-fitting linear relationship to the observed data, aiding in the comprehension of the trend and magnitude of the association between variables [
42]. Then, the data were classified into groups to characterize the phenomenon in a more complete and accurate way. An iterative process was defined to establish the optimal data classification, enabling the development of predictive models with the best possible fit. This process is detailed below:
The data were divided into three groups to characterize the wide variability of the IRI according to the percentage difference between the predicted and observed values of the predictive model. This allowed for distinguishing data that were fitted poorly by the model. This process led to three groups: (1) the Middle group, formed by data whose prediction error was less than a threshold p, (2) the Upper group, consisting of values that were underpredicted by at least p, and (3) the Lower group, whose values were overpredicted by at least p. The parameter p was changed in each iteration, taking values from 1 to 30, increasing by one each time.
Once the groups were defined, the classification model was developed using multinomial logistic regression to accurately assign each road section under study to its respective group.
After defining the groups and the classification model, a prediction model was developed using linear regression to predict the IRI from the PCI of each group.
The dual model selected through this iterative process is the one that achieves the best R2 values in the group-specific linear regressions while also minimizing the classification errors in the multinomial logistic regression. Consequently, this model optimizes both the accuracy of the predictions and the correct classification of data into their respective groups.
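The grouping and threshold search described above can be sketched as follows. This is an illustrative outline under stated assumptions rather than the SPSS workflow used in the study: the percentage difference is assumed to be taken relative to the predicted value, the selection rule (maximizing the lowest group R2) is a simplification that ignores the classification error, and all function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(x, y, a, b):
    """Coefficient of determination of the line a + b * x on (x, y)."""
    my = sum(y) / len(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def assign_group(observed, predicted, p):
    """Group a point by its percentage difference from the initial prediction.

    Assumes the predicted value is positive, as expected for the IRI.
    """
    diff = 100 * (observed - predicted) / predicted
    if diff >= p:
        return "Upper"   # underpredicted by at least p percent
    if diff <= -p:
        return "Lower"   # overpredicted by at least p percent
    return "Middle"


def search_threshold(pci, iri, thresholds=range(1, 31)):
    """Try each threshold p and keep the one with the best worst-group R2."""
    a0, b0 = fit_line(pci, iri)  # initial IRI-PCI model on all data
    best = None
    for p in thresholds:
        groups = {"Upper": ([], []), "Middle": ([], []), "Lower": ([], [])}
        for x, y in zip(pci, iri):
            gx, gy = groups[assign_group(y, a0 + b0 * x, p)]
            gx.append(x)
            gy.append(y)
        scores = {}
        for name, (gx, gy) in groups.items():
            if len(set(gx)) < 2 or len(set(gy)) < 2:
                break  # a group is too small to fit a regression line
            scores[name] = r_squared(gx, gy, *fit_line(gx, gy))
        else:
            worst = min(scores.values())
            if best is None or worst > best[1]:
                best = (p, worst, scores)
    return best
```

In the full procedure, each candidate threshold would also be scored by the misclassification rate of the multinomial logistic regression before a value of p is accepted.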
Statistical analyses were performed using IBM SPSS Statistics 26.0. The analysis of the linear regression results was based on the coefficient of determination (
R2), the adjusted coefficient of determination (adjusted
R2), and the significance level. The
R2 coefficient measures the proportion of the variation in the dependent variable about its mean that is explained by the independent variables. The
R2 coefficient can vary between 0 and 1. If the regression model is estimated and applied appropriately, researchers can assume that a higher
R2 value indicates the greater explanatory power of the regression equation and, consequently, the better prediction of the dependent variable. Values of 0.6 or higher are considered acceptable [
42]. In addition, the mean absolute error (MAE) and root-mean-square error (RMSE) were calculated for each group. The MAE measures the average of the absolute errors, providing a measure of how much the predictions deviate, on average, from the observed values. A lower MAE indicates that the predictions are, on average, closer to the true values. The RMSE measures the square root of the mean squared errors, penalizing larger errors, as these have a greater impact when squared. A lower RMSE indicates better model accuracy, and its value is more sensitive to large errors compared to the MAE.
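The three goodness-of-fit measures described above can be computed directly from paired observed and predicted values. The following is a minimal sketch; the function name and the sample values are illustrative and are not taken from the study data.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, MAE, and RMSE for paired observed and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(e ** 2 for e in errors)                    # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)     # total sum of squares
    return {
        "r2": 1 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
    }


if __name__ == "__main__":
    # Illustrative values only: three exact predictions and one large miss.
    print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0]))
```

In this example a single large error gives an MAE of 0.5 but an RMSE of 1.0, which illustrates the greater sensitivity of the RMSE to large errors. For any set of errors, the RMSE is at least as large as the MAE.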
Multinomial logistic regression was used to determine the membership of each observation in the established groups. Multinomial logistic regression is a statistical method used for modeling the relationship between multiple independent variables and a categorical dependent variable with more than two distinct categories [
42]. The outcome of a multinomial logistic regression is a set of equations representing the likelihood of an observation falling into each category. It is commonly used to identify independent variables that exhibit a robust association with the studied dependent variable [
43]. In this study, the categorical variable was the group: Upper, Middle, or Lower. For the independent variables, the analysis incorporated the variables influencing pavement deterioration, including pavement distresses, traffic characteristics, and climatic conditions.
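To make the "set of equations" concrete, the sketch below evaluates the group membership probabilities of a multinomial logit model with a reference category. The coefficient values, the variable names (TRANS_CRACK_HIGH, LONG_NWP_CRACK), and the choice of the Middle group as the reference category are hypothetical placeholders, not the fitted model reported in this study.

```python
import math


def group_probabilities(features, coefficients, reference="Middle"):
    """Evaluate multinomial logit probabilities for one road section.

    features: dict mapping variable name to value.
    coefficients: dict mapping each non-reference group to a dict holding an
        'intercept' and one weight per variable.
    """
    scores = {reference: 0.0}  # the reference category has a linear score of zero
    for group, weights in coefficients.items():
        z = weights.get("intercept", 0.0)
        z += sum(w * features.get(name, 0.0)
                 for name, w in weights.items() if name != "intercept")
        scores[group] = z
    peak = max(scores.values())  # subtract the maximum for numerical stability
    exp_scores = {g: math.exp(z - peak) for g, z in scores.items()}
    total = sum(exp_scores.values())
    return {g: v / total for g, v in exp_scores.items()}


if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    coefficients = {
        "Upper": {"intercept": -1.0, "TRANS_CRACK_HIGH": 0.5},
        "Lower": {"intercept": -0.5, "LONG_NWP_CRACK": 0.3},
    }
    print(group_probabilities({"TRANS_CRACK_HIGH": 4.0}, coefficients))
```

Each observation is then assigned to the group with the highest probability.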
The analysis of the multinomial logistic regression results used p-values to identify variables that significantly influence the dependent variable. Also employed were the Cox–Snell and Nagelkerke coefficients [42]. These coefficients differ in their calculation methods but are conceptually analogous. They can be likened to the R2 value in a linear regression as they serve as an indicator of the model’s substantive significance [42]. Three key metrics were used to evaluate the model’s classification performance: precision, recall, and F1 score [44,45]. In addition, to evaluate the accuracy of the model, the errors of the multinomial logistic regression were analyzed by comparing the predicted values with the observed values for membership in each group.
To calculate the key metrics of precision, recall, and F1 score, it is necessary to understand three terms: true positives, false positives, and false negatives. True positives refer to the instances in which the model correctly predicts a positive outcome. False positives occur when the model incorrectly classifies a negative instance as positive. Lastly, false negatives happen when the model wrongly classifies a positive instance as negative.
Precision measures the proportion of correct predictions of a specific class among all instances classified as belonging to that class. In the context of multinomial logistic regression, high precision indicates that the model produces few false positives for that class. It is useful when the objective is to minimize classification errors in a particular category. However, precision should be interpreted in conjunction with recall to provide a comprehensive understanding of classification efficiency. Precision is calculated as follows:

$$\text{Precision} = \frac{TP}{TP + FP}$$

where TP is the number of true positives and FP is the number of false positives.
Recall assesses the proportion of true positives correctly identified of the total instances that belong to the class in question. A high recall value signifies that the model captures most of the true cases within a class, though it may compromise precision if it includes false positives. Recall is calculated as follows:

$$\text{Recall} = \frac{TP}{TP + FN}$$

where TP is the number of true positives and FN is the number of false negatives.
The F1 score is the harmonic mean of precision and recall. This metric is particularly useful when it is desirable to balance precision and recall, as it provides a single value representing the overall model performance, especially in cases with class imbalance (i.e., some classes have significantly more instances than others). The factor of two in the F1 score formula comes from taking the harmonic mean of two quantities; because the harmonic mean emphasizes lower values, neither metric can disproportionately influence the result. A high F1 score indicates that the model achieves both strong precision and recall, making it a robust measure of performance in multinomial classification tasks. The F1 score is calculated as follows:

$$F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
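As an illustration of how these three metrics are obtained for each group, the sketch below counts true positives, false positives, and false negatives per class from paired observed and predicted labels. The function name and the small label lists are illustrative and do not reproduce the study data.

```python
def per_class_metrics(observed, predicted, labels=("Lower", "Middle", "Upper")):
    """Return {label: (precision, recall, f1)} using one-vs-rest counts."""
    pairs = list(zip(observed, predicted))
    metrics = {}
    for label in labels:
        tp = sum(o == label and p == label for o, p in pairs)
        fp = sum(o != label and p == label for o, p in pairs)
        fn = sum(o == label and p != label for o, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # Harmonic mean of precision and recall; zero when both are zero.
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = (precision, recall, f1)
    return metrics


if __name__ == "__main__":
    observed = ["Lower", "Lower", "Middle", "Middle", "Upper", "Upper"]
    predicted = ["Lower", "Middle", "Middle", "Middle", "Middle", "Upper"]
    for label, values in per_class_metrics(observed, predicted).items():
        print(label, values)
```

In this toy example the Middle class absorbs one Lower and one Upper observation, so its recall is 1.0 while its precision falls to 0.5.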
2.3. Validation
Validation of the developed model is essential to ensure its robustness. As indicated above, a randomly selected 30% of the data points was reserved for validation. Both the classification and prediction models were validated. For the classification model, three key metrics were analyzed to assess its performance: precision, recall, and F1 score [44]. Moreover, the number of hits was analyzed with respect to the number of misses for each group. For the prediction model, the IRI values predicted by the model were statistically compared with the observed values, and the R2 was calculated to assess the robustness of the model [42]. Additionally, the errors of the linear regressions were analyzed using the MAE and RMSE.
3. Results
3.1. Data Collection and Preparation
After data collection and preparation, the total sample had 2044 data points, which were collected from 407 different road sections spread over 30 different US states. This varied geographic distribution ensures that the trained model takes into account a wide range of climatic, traffic, and pavement characteristics. The relationships between the variables were examined, revealing the presence of multicollinearity among the traffic-related variables (KESAL_CUM, AADT_CUM, AADTT_CUM). To address this multicollinearity, given the specificity of the data analyzed, only the variable KESAL_CUM was retained. This decision was based on the fact that the KESAL_CUM variable provides a comprehensive representation of traffic conditions, integrating light (AADT_CUM) and heavy (AADTT_CUM) vehicle data, and simplifies the model without compromising the integrity of the essential information.
The characteristics of the 2044 individual data points were analyzed in greater detail, taking into account the ranges of the IRI, PCI, and the various types of pavement distresses.
Figure 3 shows the distribution of the data according to the PCI ranges defined by ASTM International [
17], indicating that the data cover all pavement conditions, from PCI 0 to PCI 100. However, it should be noted that no PCI classification is applied in this study; instead, the PCI values are directly employed as a continuous variable for the analyses performed. Similarly,
Figure 4 presents the distribution of IRI values, offering insight into the variability of pavement roughness.
Table 2 shows basic descriptive statistics for each type of pavement distress. For all pavement distresses, the minimum value and the lower quartile are zero. It can be seen that the most common pavement distresses are non-wheel-path longitudinal cracks and transverse cracks.
3.2. Dual Model Development
A random subset of 70% of the sample was selected to train the predictive model. To establish the initial model, the IRI data points were plotted against the PCI values, as shown in
Figure 5 along with the equation for the fitted regression line and the coefficient of determination. The
R2 value is low (
R2 = 0.31) for this initial regression, which is expected as the full dataset corresponds to a wide geographical area over a long period of time. This value is similar to that obtained by Piryonesi and El-Diraby [
10] in their study (0.3) when comparing the IRI and PCI using the LTPP database. As noted by these authors, the data were collected by different agencies using different types of technologies and under different environmental conditions, so the variance of the data may greatly affect the correlation between the IRI and PCI.
Since the
R2 value obtained is small, an iterative process was performed to maximize the value. As explained in
Section 2, this iterative process consists of three stages: (1) splitting the data into three groups (Upper, Middle, and Lower) according to the percentage difference between the predicted and observed values
p, (2) obtaining the classification model through multinomial logistic regression, and (3) determining the prediction model for each group. The iterative process concluded that
p must be 20%. Therefore, the Upper and Lower groups are composed of those data whose observed IRI differs by more than 20% from the value predicted by the initial model. The results suggest that the PCI and IRI have a better fit when the data are divided into three groups, with the Middle group excluding those data in which the IRI differs by more than 20% from the predicted value. In this way, the Upper and Lower groups can be identified by logistic regression, allowing a more accurate linear regression to be determined for them.
In the second stage, the classification model was established using multinomial logistic regression to determine the membership of each observation in one of the established groups. This model took into account variables such as pavement distresses, traffic characteristics, and climatic conditions. The results in
Table 3 show the statistically significant independent variables in each data group at a level of significance of
p < 0.05. At this threshold, an association at least as strong as the one observed would be expected less than 5% of the time if the predictor had no real effect, which supports the identification of significant predictors. The training dataset comprises a total of 1431 data points, distributed as follows: 473 data points belong to the Lower group, 631 belong to the Middle group, and 327 belong to the Upper group.
The likelihood of a road belonging to the Upper group increases with the presence of medium- and high-severity transverse cracks, particularly those of high severity, as well as low-severity sealed transverse cracks and low-severity patches. Conversely, high-severity longitudinal non-wheel-path cracks, low-severity sealed longitudinal non-wheel-path cracks, and higher traffic volumes (KESAL) decrease this probability. These findings align with the idea that a higher number of cracks is associated with an increased IRI [
46]. Similarly, a greater number of patches raises the likelihood of recording a higher IRI [
34]. In addition, longitudinal non-wheel-path cracks do not affect the IRI, as it is measured within the wheel path. Regarding traffic (KESAL), these results are consistent with the sample data, which show less traffic in the Upper group and more in the Lower group. It could be inferred that roads designed to handle dense traffic are constructed with better quality, whereas those with less traffic may have poorer pavement that deteriorates more quickly.
The presence of longitudinal non-wheel-path cracks, regardless of their severity, increases the probability of being categorized in the Lower group. As previously mentioned, these cracks do not influence the IRI since it is measured in the wheel path. Bleeding and aggregate polishing also increase the likelihood of belonging to this group, albeit to a lesser degree. These distresses are likely to generate a smoother pavement and, therefore, are not expected to be closely related to the IRI. In contrast, medium- and high-severity transverse cracks and accumulated precipitation decrease this probability.
This multinomial logistic regression yielded a Cox–Snell coefficient of 0.31 and a Nagelkerke coefficient of 0.35. Although these values are not high, three key metrics were used to evaluate the model performance:
precision,
recall, and
F1 score (
Table 4). In the Lower group, the
precision indicates that 62% of the instances that were classified as positive in this group were true positives. The
recall shows that the model correctly identified 59% of the positive cases, and the
F1 score of 0.61 suggests a moderate balance between
precision and
recall. In the Middle group, the
precision is slightly lower. However, the
recall is high (0.74), implying that the model correctly identified most of the positive cases. The
F1 score (0.63) is similar to that of the Lower group, suggesting that although the
precision is low, the high
recall helps to maintain a reasonable balance. In the Upper group, although the
precision is high (0.68), the
recall and
F1 score are lower than in the other groups, suggesting that the model achieved a weaker performance in this group. Given these results, the errors of the multinomial logistic regression were also analyzed by comparing the predicted and observed values for membership in each group (
Table 5). Although certain discrepancies were identified between the model predictions and the observed values, the proposed model provides a satisfactory prediction, as these discrepancies do not significantly affect the
R2 value of the linear regressions, as discussed in
Section 3.3. These discrepancies are mainly between the Middle and Lower groups or the Middle and Upper groups. That is, the model rarely predicts that data from the Upper group are in the Lower group or vice versa; such discrepancies would substantially affect the
R2 value. Given the characteristics and distribution of the data, as well as the regression equations obtained, the discrepancies that are observed do not significantly affect the fit of the model, which presents a high
R2 value in each of the groups.
Once the groups were defined and the classification model established, a predictive model based on linear regression was developed for each group. These models present
R2 values of 0.62, 0.82, and 0.72, respectively, for the Upper, Middle, and Lower groups (
Figure 6), indicating varying degrees of explanatory power between the groups. The
R2 value of 0.82 for the Middle group suggests a high predictive ability within this range, implying that the model captures the variance of the data effectively for this subset. For the Upper and Lower groups, the
R2 values of 0.62 and 0.72, while acceptable, are lower than in the Middle group. This lower predictive accuracy, especially in the Upper group, could be attributed to the smaller sample size (22.85% of the training dataset) and the higher dispersion of the data within this group. The same is reflected in the MAE and RMSE results of each group. The Lower group has an MAE of 0.24 and an RMSE of 0.08, showing reasonable performance in this group with moderate errors. Similarly, the Middle group has an MAE of 0.27 and an RMSE of 0.07. However, the Upper group has an MAE of 0.45 and an RMSE of 0.1, indicating that the model has a lower predictive capacity for this category. These disparities across the groups highlight the importance of segmenting the data to improve predictive accuracy.
3.3. Validation
For validation, the remaining random subset of 30% of the data was used. The validation process began by applying the classification model to determine the group to which each data point belongs. Following this, the appropriate prediction model was applied according to the assigned group. The validation dataset comprises a total of 613 data points, distributed as follows: 181 data points belong to the Lower group, 363 belong to the Middle group, and 69 belong to the Upper group.
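The two-step application of the dual model to a validation record can be sketched as follows: the record is first assigned to the most probable group, and the regression line of that group is then used to estimate the IRI. The group probabilities and the intercept and slope values below are hypothetical placeholders, not the fitted equations of this study.

```python
def predict_iri(pci, group_probabilities, group_models):
    """Assign the most probable group, then apply that group's IRI-PCI line.

    group_probabilities: dict mapping group name to probability (classifier output).
    group_models: dict mapping group name to an (intercept, slope) pair.
    """
    group = max(group_probabilities, key=group_probabilities.get)
    intercept, slope = group_models[group]
    return group, intercept + slope * pci


if __name__ == "__main__":
    # Hypothetical regression coefficients, for illustration only.
    group_models = {
        "Lower": (2.0, -0.012),
        "Middle": (2.8, -0.017),
        "Upper": (4.0, -0.022),
    }
    probabilities = {"Lower": 0.2, "Middle": 0.7, "Upper": 0.1}
    print(predict_iri(80, probabilities, group_models))
```

Because the group regression lines are adjacent, a record misassigned to a neighboring group still receives an estimate from a nearby line, whereas a confusion between the Upper and Lower groups would produce a much larger error.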
The model validation yielded a coefficient of determination (R2) of 0.89, confirming the robustness of the proposed model. Additionally, during the validation process, the IRI values predicted by the model were statistically compared with the observed values, and the MAE and RMSE of each group were analyzed. The Lower group has an MAE of 0.24 and an RMSE of 0.34, showing reasonable performance in this group with moderate errors. Similarly, the Middle group has an MAE of 0.25 and an RMSE of 0.34. However, the Upper group has an MAE of 0.50 and an RMSE of 0.59, indicating that the model has a lower predictive capacity for this category. This discrepancy can be attributed to several factors. First, the smaller amount of training data available for the Upper group limits the model’s ability to learn patterns within this category. Second, pavement deterioration in higher IRI ranges (Upper group) tends to become less linear, which makes it more challenging for the model to accurately represent these variations. Finally, the greater variability within these data further amplifies these challenges, contributing to the observed lower performance for this group.
Table 6 presents the results obtained for precision, recall, and F1 score. The Middle group presents better results than the others, reaching a precision of 0.80 and a recall of 0.61. This means that 80% of the predictions made for this class were correct, and the model identified 61% of the real cases in this category. The F1 score of 0.69 indicates an acceptable balance between precision and recall, suggesting that the model performs reasonably reliably for this class. For the Lower and Upper groups, the model performance metrics are much lower. In addition, the low F1 scores indicate difficulties in classifying these categories effectively, possibly due to a lack of distinctive features in the data or an imbalance in the numbers of representative observations.
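As a quick check of how these classification metrics relate, the sketch below computes the F1 score as the harmonic mean of precision and recall, assuming the standard definitions. Applied to the Middle group's reported precision (0.80) and recall (0.61), it gives a value consistent with the reported F1 score of 0.69. The function names are illustrative only.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision and recall for one class from its confusion-matrix counts."""
    precision = tp / (tp + fp)  # share of predictions for the class that are correct
    recall = tp / (tp + fn)     # share of true class members that are identified
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the Middle group in the text.
middle_f1 = f1_score(0.80, 0.61)
print(round(middle_f1, 2))
```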
To further analyze the model’s predictive performance, the prediction errors were also analyzed (Table 7). As in the training phase, the discrepancies between the predicted and observed groups occur between adjacent groups, that is, between the Middle and Lower or the Middle and Upper groups. Given the characteristics and distribution of the data and the regression equations obtained, these discrepancies are not considered to significantly affect the fit of the model, which presents a high R2 value. This is because the errors occur in data points that are close to the boundary between groups, so even when a classification is incorrect, the resulting prediction error remains small. Therefore, after the analysis of the results of the multinomial logistic regression and given a coefficient of determination (R2) of 0.89 in the validation of the model, the prediction model is considered to be robust.
4. Discussion
The model developed in this study uses a novel methodology to predict the IRI–PCI relationship across diverse climatic and traffic conditions. It employs multinomial logistic regression to classify the IRI–PCI relationship according to pavement distress, climate, and traffic conditions, subsequently applying the most appropriate prediction model for each group. Because some pavement distresses influence the PCI but not the IRI, grouping the sections in this way makes it possible to obtain satisfactory IRI predictions directly from the PCI.
Evaluation of validation metrics, including precision, recall, and F1 score, demonstrates that the model performs well overall. However, the Upper group exhibits lower precision and recall due to limited training data and greater variability within this subset, suggesting an opportunity for further refinement in this area. Future efforts should address data limitations for smaller and more variable subsets like the Upper group. Enhancing differentiation between groups and incorporating additional variables could further improve predictive accuracy and extend the methodology’s applicability.
A comparison with prior studies underscores the strengths of the proposed model. For instance, Piryonesi and El-Diraby [10], using the LTPP database, achieved R2 values exceeding 0.7 in some cases after grouping data by location and functional class, but their context was highly specialized and limited in scale. Similarly, Park et al. [14] obtained an R2 of 0.59 with a small dataset comprising 63 data points from 20 road sections across 11 states. In contrast, this study’s dual-model approach achieves R2 values of 0.62, 0.72, and 0.82 using a large, geographically diverse sample of 2044 data points from 407 road sections across 30 US states, demonstrating superior robustness and applicability.
Other studies, such as those by Mactutis et al. [47], Chandrakasu and Rajiah [46], and Kirbaş [34], have explored the relationship between the IRI and specific pavement distresses, including cracks, rutting, and patches. While these studies provide valuable insights into the impact of individual distresses on the IRI, their predictive models often focus on localized contexts or specific distresses. For instance, Mactutis et al. [47] and Chandrakasu and Rajiah [46] emphasized the role of cracks and rutting, while Kirbaş [34] identified patches, alligator cracks, and depression as significant contributors to increased IRI, with bleeding having minimal impact. In alignment with these findings, the current study confirms the critical role of cracks in group classification and reveals that patches increase the likelihood of belonging to the Upper group, whereas bleeding is associated with the Lower group.
By enabling IRI estimation from pavement distress and climatic and traffic conditions, this approach could potentially be applied in management systems that use artificial intelligence image processing techniques to determine pavement distress. With this dual model, the IRI can be estimated more economically, which is beneficial when estimating criteria such as user costs for optimal maintenance planning. This could significantly reduce the costs associated with road condition evaluation and maintenance management while promoting scientifically informed decisions and extending road service life.
5. Conclusions
This study addresses the lack of comprehensive IRI prediction models suitable for varied climatic regions, traffic levels, and multiple distress types, which represents a major barrier to the efficient and sustainable management of pavement maintenance. To fill this gap, data from the LTPP database were used to develop a dual-model approach. First, a multinomial logistic regression was implemented to classify pavement sections based on pavement distress, climatic, and traffic conditions. Subsequently, according to this classification, a specific linear regression between the IRI and PCI was applied. The results demonstrated satisfactory predictive accuracy, with R2 values of 0.62, 0.72, and 0.82 for the linear regressions and an overall R2 of 0.89 in the validation of the model, making this methodology applicable to roads with a wide variety of climatic and traffic conditions. Furthermore, it was observed that pavement distresses, together with the PCI, have a significant influence on the prediction of the IRI.
The methodology proposed in this study offers tangible benefits for pavement management systems. By reducing inspection costs and improving the reliability of pavement condition assessments, it provides infrastructure managers with a clearer understanding of pavement performance. The model’s integration with artificial intelligence technologies, such as automated image processing, allows for the extraction of pavement distress data and the calculation of the PCI. With these data, along with readily available climate and traffic information, all the necessary variables are obtained to calculate the IRI. This enables efficient, cost-effective, and accurate IRI estimation, allowing agencies to make informed, data-driven decisions for optimal maintenance planning, promoting extended road service life, and reducing lifecycle costs.
However, it is essential to acknowledge that the accuracy of IRI prediction based on the PCI decreases for higher IRI values. This reduction in predictive performance can be attributed to an imbalance in the dataset, with fewer observations at higher IRI levels, as well as the greater variability typically associated with these values. Future research should focus on strategies to address this challenge, such as collecting additional data points for higher IRI ranges and further investigating the relationship between the IRI and PCI under these conditions. This limitation emphasizes the necessity for the careful monitoring of pavement deterioration trends to ensure optimal use of the model. Maintenance agencies must remain vigilant in observing pavement distress and exercise prudent application of the predictive model. Furthermore, certain limitations, such as the restricted IRI range of 0–4 and the limited consideration of international context, should be addressed to enhance its applicability. Expanding the database to include global datasets with diverse climatic and road conditions would improve the model’s generalizability. Additionally, the potential impact of data quality on prediction accuracy underscores the need for standardized data collection practices across regions.
While this approach offers a practical and accessible solution for road infrastructure managers, its reliance on linear regression may oversimplify the non-linear relationships inherent in pavement performance. To address this, future research could explore alternative modeling approaches, such as non-linear regression or machine learning techniques.
Beyond expanding the database, future studies should investigate additional variables that influence the IRI–PCI relationship, including subgrade material properties, pavement age, friction levels, macrotexture, and mechanical properties derived from Falling Weight Deflectometer or Lightweight Deflectometer tests. Incorporating these factors into the model could provide a more nuanced understanding of pavement performance and support the development of advanced predictive frameworks. Additionally, leveraging automated data collection technologies, such as AI-based image processing, would further improve data acquisition efficiency and accuracy, facilitating proactive monitoring and enabling maintenance agencies to address issues before they become critical.
For maintenance agencies, the model offers a practical framework to proactively monitor pavement conditions. Agencies are encouraged to incorporate this methodology into their asset management systems, along with the use of advanced image processing technologies, which could further streamline the application of this model, reducing inspection times and operational costs, while ensuring timely interventions to avoid critical road failures.
Addressing these challenges and opportunities in future research has the potential to significantly enrich our understanding of the IRI–PCI relationship and drive improvements in road maintenance and management practices, helping to ensure the long-term safety and durability of road infrastructure.