5.1. Multiple Linear Regression Model
After grouping the stretches with identical characteristics of pavement structure, age and traffic volume in the two-lane roads with flexible pavements of the network of Biscay, 105 stretches were available for modeling. The dependent variable was the mean IRI in these sections, and the candidate predictor (independent) variables were Age, R.Age, TotBit, SSbit, SStot, SthCS, AADT, H.AADT, TotVeh and TotH.Veh. Initially, the correlation between the dependent variable and each of the independent variables was evaluated by means of the Pearson coefficient, R, together with the significance of that correlation (Table 3).
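The screening step described above can be sketched in a few lines. This is a minimal illustration, not the SPSS procedure used in the study: the function names and the sample values are hypothetical, and only the Pearson coefficient and its t statistic (with n - 2 degrees of freedom) are computed, since the p-value lookup requires a t distribution that the Python standard library does not provide.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom (requires |r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical values for illustration only (not data from the study).
age = [2.0, 5.0, 8.0, 12.0, 15.0, 20.0]
iri = [1.1, 1.4, 1.3, 2.0, 2.4, 2.9]
r = pearson_r(age, iri)
t = t_statistic(r, len(age))
```

The t statistic would then be compared with the critical value of Student's t distribution to obtain the significance reported alongside R.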
As seen, the best correlations were obtained with Age, R.Age and TotBit. Correlations with SthCS, SSbit, SStot, TotVeh and TotH.Veh were also high. On the contrary, AADT and H.AADT showed low correlations with very low significance.
In the next step, possible transformations of the variables were analyzed by studying the curves that best fit the relationship between the dependent variable and each of the independent variables. Table 4 shows the equations (curves) that best fit each relationship. Sometimes a quadratic or a cubic curve fit better, but if the improvement in the coefficient of determination over the linear fit was very low (ΔR² < 0.05), the linear model was maintained and the independent variable was not transformed.
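The selection rule above (keep the linear form unless a transformation raises R² by at least 0.05) can be sketched as follows. This is a simplified illustration under stated assumptions: only transformations of the independent variable that keep the fit linear are compared, so R² is the squared Pearson coefficient; the quadratic and cubic curves mentioned in the text would need a polynomial fit instead. The function names and the sample data are hypothetical.

```python
import math

def r2_simple(x, y):
    """R² of a simple linear regression of y on x (squared Pearson coefficient)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def choose_transform(x, y, transforms, min_gain=0.05):
    """Return (name, R²) of the best transform of x, or ("linear", R²) if the gain is below min_gain."""
    base_r2 = r2_simple(x, y)
    best_name, best_r2 = "linear", base_r2
    for name, func in transforms.items():
        r2 = r2_simple([func(v) for v in x], y)
        if r2 > best_r2:
            best_name, best_r2 = name, r2
    if best_r2 - base_r2 < min_gain:
        return "linear", base_r2
    return best_name, best_r2

# Hypothetical data in which y grows with the logarithm of x.
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [0.0, 1.0, 2.0, 3.0, 4.0]
name, r2 = choose_transform(x, y, {"ln": math.log})
```

With a threshold of this kind, a transformed variable only enters the candidate set when it gives a clear gain over the untransformed one.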
As shown, different transformations can be made to improve the correlation between each independent variable and the IRI (the dependent variable). Age improved its correlation by means of a potential transformation (y = ab^x), but R.Age showed a better correlation with an exponential relationship (y = e^(ax)), giving the variable ExpR.Age. For TotBit, SSbit and SStot, a natural logarithm transformation is suggested to improve the value of R (LnTotBit, LnSSbit and LnSStot). For AADT and H.AADT, an inverse transformation is recommended, but the value of R did not improve considerably. For TotVeh and TotH.Veh, a quadratic transformation was applied (TotVeh² and TotH.Veh²).
With the transformed variables, the influence of each variable in a multiple linear regression model was tested using forward stepwise regression analysis in Version 24 of the SPSS Statistics software. Moreover, additional multiple linear regression models were tested, combining different variables and their transformations. Apart from a high coefficient of determination, all the variables were required to be significant (according to the individual Student’s t-test). Table 5 shows some of the analyzed models.
From the summary of tested models in Table 5, three observations can be made. First, the variable ExpR.Age was always significant. Second, a better correlation was obtained with LnTotBit than with LnSStot or LnSSbit. Third, the accumulated numbers of heavy vehicles and of total vehicles that used the section from its opening until the IRI data collection were also significant when introduced with the quadratic transformation (TotVeh² and TotH.Veh²). The best two models were the last two. The last one employed the accumulated number of heavy vehicles, with a coefficient of determination of R² = 0.437, and the penultimate one used the accumulated number of vehicles of any type as the traffic variable, with a better coefficient of determination (R² = 0.472). Traditionally, deterministic IRI performance models include the accumulated heavy traffic as the variable that increases roughness, because heavy vehicles are considered responsible for pavement damage owing to their higher weight, which implies a greater load on the pavement. As a consequence, the ESAL parameter is usually employed. From the definition of the ESAL [13], it can be deduced that passenger cars (light vehicles) do not cause great damage to the road, as the effect of one heavy vehicle is similar to that produced by thousands of passenger cars.
The better correlation between IRI and the total vehicles (and not only the total heavy vehicles) may be explained by the high correlation between all these variables (TotVeh and TotH.Veh and their quadratic transformations) (Table 6), which means that they can substitute for each other in the models. However, as seen in Table 5, if both were included, at least one of them became non-significant (7th proposed model in Table 5). This high correlation between the variables arose because the percentage of heavy vehicles was similar in all the analyzed two-lane roads, between 5% and 8% of the total traffic in the majority of them.
Consequently, although the equation with the accumulated number of total vehicles provided a better correlation, the one with the accumulated number of heavy vehicles was selected for predicting IRI evolution, shown in Equation (14),
where
IRI is the predicted mean IRI value (m/km) of the stretch with identical values of age, traffic and thickness of bituminous layers in flexible pavements;
R.Age is the real age of the pavement in years, calculated from the exact date of opening to traffic until the moment of interest and expressed as a decimal fraction, where 0.5 means six months;
TotBit is the total thickness of the bituminous layers in the flexible pavement, in cm;
TotH.Veh is the accumulated number of heavy vehicles that circulated through the project lane (the lane with the greater quantity of heavy vehicles in the section) from the opening to traffic until the moment of interest, in thousands of heavy vehicles. Usually, both directions are assumed to carry identical heavy traffic, i.e., half of the total each.
The assumptions of the multiple linear regression model, discussed in Section 3.1, were fulfilled. As observed in Table 7, the F-test provided a p-value < 0.001, which implies that the proposed relationship is statistically significant, and the Student’s t-tests of the coefficients showed that they were significantly different from 0 (the 95% confidence intervals of the parameters do not include zero). The Durbin-Watson statistic was 1.448, supporting the independence of the errors, i.e., the absence of autocorrelation. Homoscedasticity was verified in Figure 2a, where no patterns were detected. One Condition Index was over 30 (Table 9), which could indicate some problems with multicollinearity. Nevertheless, there was no strong correlation between the explanatory variables of the model (Age and TotH.Veh could be related, but they only had a medium Pearson coefficient, R = −0.557), and the Variance Inflation Factors were low (VIF < 10) for all the variables. Therefore, it can be stated that there was no multicollinearity. The residuals followed a normal distribution. Finally, Figure 2b shows the plot of observed vs. predicted values, in which the observations lie near the diagonal.
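Two of the diagnostics mentioned above are simple enough to sketch directly. This is a minimal illustration with hypothetical function names, not the SPSS output of the study: the Durbin-Watson statistic is computed from a sequence of residuals (values near 2 suggest no first-order autocorrelation), and the Variance Inflation Factor is shown only for the special case of a model with exactly two predictors, where it reduces to 1 / (1 - r²). In a model with more predictors, each VIF comes from regressing that predictor on all the others.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences over sum of squares."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

def vif_two_predictors(r):
    """VIF when the model has exactly two predictors with pairwise correlation r."""
    return 1.0 / (1.0 - r * r)

# Hypothetical residuals for illustration only.
dw = durbin_watson([0.3, -0.1, 0.2, -0.4, 0.1, 0.0, -0.2])

# Pairwise illustration with the Pearson coefficient quoted in the text (R = -0.557).
vif = vif_two_predictors(-0.557)
```

Under that two-predictor simplification, a correlation of medium size such as R = −0.557 gives a VIF well below the threshold of 10 quoted in the text.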
As can be observed, the selected variables showed their real effect on IRI progression. Whereas age and accumulated heavy vehicles increased the value of the roughness index, the thickness of the bituminous layers decreased it. With regard to the age of the pavement, it was concluded that it was better to include a variable indicating the exact age of the pavement, not only the difference between the year of construction and the year of the IRI data collection. Moreover, as can be seen in the literature, the Annual Average Daily Traffic (AADT) and the Annual Average Daily Heavy Traffic of the year of data collection were not employed as parameters in IRI prediction, because the accumulated number of heavy vehicles was the factor that really influenced road deterioration. Furthermore, when comparing the thickness of bituminous layers (TotBit), SSbit and SStot, it was observed that TotBit provided a better model. It can be deduced that, although the crushed stone in the unbound base or subbase contributed to the structural capacity, its contribution was not as important as that of the bituminous layers, as can be seen when comparing the Young's moduli of the materials (Table 2). However, SSbit, which was created to better indicate the structural capacity of the bituminous layers, did not provide a better correlation with the IRI (Table 3 and Table 4), and when introduced in the model, it produced a worse model than TotBit (Table 5). It can be concluded that the Young's modulus was not a vital factor when trying to measure the structural contribution of each layer to the entire pavement structure. Other parameters should be chosen to account for the structural capacity, in a similar way to the Structural Number and its derivations.
5.2. Generalized Linear Model with a Qualitative Variable
Although the effect of the thickness of the bituminous layers was included in Equation (14), the influence of the bituminous materials employed was not considered, because SSbit was discarded. Nonetheless, it seemed reasonable to think that different bituminous materials in the surface layer also influence IRI progression. Cracking and permanent deformation (rutting) are said to be the main distresses in asphalt concrete layers [7], and they directly affect the roughness [57,65,66]. However, other pavement defects can appear, such as raveling, potholes, coating failure and stripping, which are not normally measured at the network level but do affect pavement roughness. Consequently, an additional qualitative variable, called SurfType, was introduced in the model to evaluate the effect of the different bituminous surface materials on the IRI evolution. Given the number of stretches with each material in the surface layer (82 with a semi-dense asphalt concrete mix of type AC 16 surf S, 8 with a semi-dense AC mix of type AC 22 surf S, 14 with a discontinuous mix (BBTM 11A) and 1 with porous asphalt (PA 11)), it was decided to establish the categories indicated in Table 10.
As a qualitative variable was introduced, it was not possible to develop a multiple linear regression. In this case, a Generalized Linear Model (GLM) was employed. The GLM is the most general model of linear regression, including the multiple linear regression model with quantitative variables and the multiple regression models with qualitative and quantitative variables at the same time, and hence it includes all the models of analysis of variance (ANOVA) and covariance (ANCOVA) [53]. This type of model can be developed with the majority of statistical software programs.
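One common way in which statistical software handles a qualitative variable in a linear model is indicator (dummy) coding against a reference category. The sketch below illustrates that idea only; the level labels and the choice of reference category are hypothetical and are not the categories of Table 10, and the function name is an assumption.

```python
def dummy_code(value, levels, reference):
    """Return 0/1 indicators for every level except the reference category."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if value == level else 0 for level in levels if level != reference]

# Hypothetical labels for illustration only (not the categories of the study).
levels = ["mix_A", "mix_B", "mix_C"]
reference = "mix_A"

row_b = dummy_code("mix_B", levels, reference)  # indicator for mix_B is set
row_a = dummy_code("mix_A", levels, reference)  # reference level: all zeros
```

Each indicator then receives its own coefficient, which measures the shift in the predicted value relative to the reference category with the quantitative variables held constant.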
Several models were tried by combining the available quantitative variables (Age, R.Age, TotBit, SSbit, SStot, SthCS, AADT, H.AADT, TotVeh and TotH.Veh) with the qualitative variable SurfType. The aim was to introduce only variables that were significant, i.e., variables that really affect IRI progression. After multiple attempts, the model with the highest coefficient of determination (R²) in which all the variables were significant is shown in Equation (15).
where IRI, LnTotBit, TotH.Veh and R.Age are defined in Equation (14), and SurfType is a variable that considers the material of the bituminous surface layer and takes the values of Table 11:
The statistical analysis of the model of Equation (15) is shown in Table 12, Table 13 and Figure 3. As seen, the model improved its prediction, with a coefficient of determination (R² = 0.482) higher than that of the previous model and a lower standard error of the estimate (SEE = 0.583). Table 12 shows the tests of between-subjects effects of the model of Equation (15), where it can be observed that all the variables were significant (p-value < 0.03). Table 13 presents the estimates of the parameters (coefficients) of the variables included in the model. The plot of predicted values vs. standardized residuals in Figure 3a showed no patterns, so the residuals were random and independent of each other. The residual variances were homogeneous, since the dispersion of the standardized residuals was similar across the whole range of predicted values. Finally, Figure 3b shows the plot of observed vs. predicted values, in which the observations lie near the main diagonal.
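The two goodness-of-fit measures quoted above can be computed from observed and predicted values as sketched below. This is a minimal illustration with a hypothetical function name and made-up numbers; it assumes the usual definitions R² = 1 - SSE/SST and SEE = sqrt(SSE / (n - p)), where p is the number of estimated parameters including the intercept.

```python
import math

def fit_statistics(observed, predicted, n_params):
    """Return (R², SEE) for a fitted model with n_params estimated parameters."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # residual sum of squares
    sst = sum((o - mean_obs) ** 2 for o in observed)              # total sum of squares
    r2 = 1.0 - sse / sst
    see = math.sqrt(sse / (n - n_params))
    return r2, see

# Hypothetical observed and predicted IRI values (m/km), for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r2, see = fit_statistics(observed, predicted, n_params=2)
```

Because the SEE is expressed in the units of the dependent variable (m/km here), it gives a direct sense of the typical prediction error.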
A similar model with SurfType and TotVeh instead of TotH.Veh was tested, providing a slightly worse model with a coefficient of determination of 0.481. Therefore, both variables contributed almost equally to the model because of the high correlation between them, which arose because roads with a similar percentage of heavy traffic were used for the model. Nevertheless, in this case the accumulated heavy traffic gave a better model and it was retained.
As seen, the material used in the surface layer also influences the IRI evolution and should be considered in the model. The coefficient of determination (R²) improved from 0.437 to 0.482. Moreover, the obtained coefficient of determination was similar to that of models presented in Table 1, around 0.50, which employed more variables. It was demonstrated that, apart from the structural capacity of the entire pavement section to resist the traffic loads and meteorological agents, the wearing course material also affects the roughness progression, and therefore characteristics such as the aggregate gradation (dense, semi-dense, discontinuous and porous mixes) and the coating binder (modified or not) must be taken into account in IRI models. In the analyzed case of Biscay, when comparing the influence of the materials employed in two-lane roads, the discontinuous mixes (BBTM type) and porous asphalts showed less deterioration than the AC 16 S mix. Despite the voids that can be found in discontinuous mixes (between 10% and 20% of the volume) and in porous asphalts (>20% of the volume), it can be deduced that the modified bitumen provided the necessary stiffness to the mix. However, with the other variables held constant, the AC 22 S exhibited a better evolution. Nevertheless, these conclusions about which material is better for the surface layer cannot be generalized to other areas, and they are limited to the province of Biscay.
To better show how to use the equation, an example is included. The IRI values were predicted for a flexible pavement of a two-lane road with 15 cm of bituminous layers, AC 16 surf S in the surface layer and an Annual Average Daily Heavy Traffic of 400 heavy vehicles/day in the project lane, with a growth rate of 1% per year. The IRI values were predicted for real ages of 5, 10 and 15 years and are shown in Table 14. The practical limit for IRI calculation is an IRI value of around 4 m/km, which is the maximum value recorded in the network of Biscay. Before that value is reached, an M&R activity is usually conducted.
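The traffic input of the example, TotH.Veh, has to be accumulated from the daily heavy traffic before the equation can be applied. The sketch below shows one way to do this under assumptions that are not stated in the text: the quoted 400 heavy vehicles/day applies to the first year, growth is compounded once per year, a year has 365 days, and a fractional final year contributes in proportion to its length. The function name is hypothetical.

```python
def accumulated_heavy_vehicles(daily_heavy, growth_rate, real_age, days_per_year=365):
    """Accumulated heavy vehicles in the project lane, in thousands, up to real_age years."""
    full_years = int(real_age)
    fraction = real_age - full_years
    total = 0.0
    for year in range(full_years):
        # Traffic of each complete year, grown from the first-year value.
        total += daily_heavy * days_per_year * (1.0 + growth_rate) ** year
    # Partial final year, weighted by the fraction elapsed.
    total += daily_heavy * days_per_year * (1.0 + growth_rate) ** full_years * fraction
    return total / 1000.0

# Traffic inputs of the example in the text: 400 heavy vehicles/day, 1% annual growth.
tot_h_veh = {age: accumulated_heavy_vehicles(400, 0.01, age) for age in (5, 10, 15)}
```

The resulting values, in thousands of heavy vehicles, are what would be entered as TotH.Veh together with R.Age, TotBit and SurfType when evaluating the model for each age.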