1. Introduction
The growth in road traffic and rapid urbanization pose a serious hazard to the surrounding environment and to public health. Ambient air pollution is an important environmental issue [1]. According to the Global Burden of Disease (GBD) study, 4.9 million deaths worldwide are attributed to air pollution exposure [2]. Air pollution is one of the biggest concerns of the modern era: improved lifestyles demand more energy and more resource exploitation, which increases the emission of toxic air pollutants into the atmosphere. These emissions affect both climate and human health [3,4]. Studies have linked air pollution to cardiovascular, respiratory and chronic diseases [5,6]. A Swedish cohort study reports that long-term exposure to air pollution may cause diabetes [7]. Poor air quality is a serious issue in developing countries because of overpopulation, urbanization and industrialization [8].
Pakistan is a South Asian country with a population of more than 200 million [9]. Rapid population growth and unplanned urbanization, together with recent industrial development, have worsened ambient air quality in the country [10,11]. Transportation is another source of air pollution, emitting 25 times more carbon monoxide (CO) and carbon dioxide (CO2), and 3.5 times more sulfur dioxide (SO2), than automobiles in the United States [12]. Pakistan ranks second among the 10 most polluted countries in the world, with 22,000 premature deaths and 163,432 disability-adjusted life years (DALYs) lost [13]. A study conducted in Pakistani cities (Islamabad, Lahore and Rawalpindi) reported levels of nitrogen oxides (NOx) and particulate matter (PM10) above WHO guidelines [14]. The Pakistan Environmental Protection Agency (Pak-EPA) monitored NO2 levels in different cities of Pakistan and estimated maximum and minimum concentrations of 37.02 ppb and 14.61 ppb in Karachi and Islamabad, respectively, whereas another study found maximum and minimum concentrations of 37.46 ppb and 2.48 ppb, respectively [14,15]. Urban air pollution costs the country about Rs. 65 billion, out of a total annual loss from environmental damage of Rs. 365 billion [13]. This financial loss arises from morbidity and mortality linked to cardiovascular and respiratory diseases and to lower respiratory illness (LRI) in children; if the effects of air pollution on education, malnutrition and earnings were also taken into account, the loss could be higher [16].
Road traffic is one of the causes of deteriorating air quality in urban areas [17]. A growing number of vehicles not only causes traffic congestion and greenhouse gas emissions but also has significant health impacts and contributes to tropospheric ozone formation [18]. Several air pollutants are present in the urban atmosphere, including particulate matter (PM), nitrogen dioxide (NO2), carbon monoxide (CO), carbon dioxide (CO2) and ozone (O3), but the pollutant that correlates well with traffic density, and that is also an important photochemical oxidant, is nitrogen dioxide [19,20]. NO2 plays a major role in the generation of secondary air pollutants (SAP), and its concentration is correlated with photochemical smog, acid deposition and ozone variations [21,22]. After particulate matter, NO2 is the second most abundant and dangerous air pollutant in Pakistan [14]; it is therefore necessary to understand the different types of pollution sources and the factors that affect the pollutant's concentration within the city.
Various models and techniques have been used to assess and model air pollution concentrations, such as dispersion models (DM), chemistry transport models (CTM), and other techniques like inverse distance weighting (IDW), ordinary kriging (OK) and machine learning (ML). The drawbacks of these models and techniques are their high data requirements, cost and complexity [23,24,25]. A further limitation of interpolation techniques is the assumption that variation in pollution depends on the distance between sites, which may lead to errors in estimated pollution [26]. In contrast, the land use regression (LUR) model has gained attention as an easy and effective approach for providing spatial distributions of air pollutants at the intra-urban scale, built on a specific number of monitoring sites and on predictor variable values gathered with a geographic information system (GIS) [27,28]. It offers a reliable solution for air pollution exposure assessment in a developing country like Pakistan [29].
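The distance assumption behind interpolation techniques can be made concrete with a minimal sketch of inverse distance weighting. The function name, the power parameter of 2 and the site coordinates and concentrations below are illustrative assumptions, not values from this study.

```python
import math


def idw_estimate(target, sites, power=2.0):
    """Estimate a concentration at `target` (x, y) from monitored sites.

    `sites` is a list of (x, y, concentration) tuples. Each site is weighted
    by 1 / distance**power, so the estimate depends only on distance and
    ignores land use, roads or any other local source.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, concentration in sites:
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            # The target coincides with a monitor: return its measurement.
            return concentration
        weight = 1.0 / distance ** power
        weighted_sum += weight * concentration
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical monitors: coordinates in km, NO2 in ppb.
example_sites = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]
midpoint_estimate = idw_estimate((1.0, 0.0), example_sites)
```

Because the weights contain nothing but distance, two unmonitored locations equally far from the monitors receive the same estimate even if one lies beside a major road, which is the source of error the text describes.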
LUR is a statistical method of air pollution modeling. It is commonly used to estimate variations in air pollutant concentrations for population exposure assessment. The technique links spatially heterogeneous air quality measurements with geospatial predictors. LUR models provide a comparatively robust method for spatial prediction, while requiring less sampling effort than geostatistical models and less data than dispersion models [30].
An LUR model combines air pollutant measurements with geographic predictors, such as land use, population density and road network data around the monitoring points; after multiple linear regression, it provides annual or seasonal pollution levels at unmonitored locations [31]. The model offers a simple and cost-effective method for air pollution exposure assessment at regional and intra-urban levels, substituting for expensive dispersion models [30,32].
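The regression step can be sketched as an ordinary least squares fit of monitored concentrations on site-level predictors, followed by prediction at an unmonitored location. This is a minimal illustration, not the procedure used in this study: the two predictors (road length and residential area), the synthetic data and all function names are assumptions, and variable selection is omitted.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - known) / aug[i][i]
    return solution


def fit_ols(predictors, concentrations):
    """Return [intercept, b1, b2, ...] from the normal equations X'X b = X'y."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, concentrations)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, predictor_row):
    """Predict the concentration at a location from its predictor values."""
    return coefficients[0] + sum(
        b * v for b, v in zip(coefficients[1:], predictor_row)
    )


# Synthetic monitoring sites: (road length, residential area) in arbitrary units.
site_predictors = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (3.0, 2.0)]
# Synthetic NO2 values generated as 5 + 2 * road + 3 * residential (no noise).
site_no2 = [7.0, 8.0, 10.0, 12.0, 17.0]

coefficients = fit_ols(site_predictors, site_no2)
unmonitored_no2 = predict(coefficients, (2.0, 2.0))
```

Applying `predict` to the predictor values of every grid cell in a city would yield a concentration surface of the kind an LUR study maps.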
To date, LUR models have mostly been applied in developed countries in North America and Europe, where exposure assessments of different air pollutants have been conducted successfully [33,34,35,36]. Developed countries remain the focus of air pollution studies on exposure assessment and health effects [37]. Studies of air pollution exposure are therefore needed to provide epidemiological evidence under the conditions of developing countries [38], which face air pollution-related health problems.
Challenges and constraints in developing LUR models in these settings include the limited availability of GIS data, emissions from air pollution sources that are not well interrelated and a lack of routine monitoring data. Concentration data from national monitoring locations are used to characterize air pollution across the entire city, which leads to errors in evaluating public exposure to air pollution [39]. It is therefore worthwhile to perform LUR studies that explore model performance, so that an efficient and economical air pollutant concentration model can be obtained.
To address the need for air pollution exposure assessment, this study explores the applicability of the LUR model to predict the spatial variation of NO2 in Lahore, Pakistan, where emission sources include local industries, household fuel use and automobiles. This would not only provide evidence for developing LUR models for different air pollutants in Pakistan, but also offer an important application of exposure assessment with high representativeness. The study provides a basic, systematic LUR method to promote wider use in other cities of Pakistan.
The aim of the present study was to develop LUR models of the seasonal variation of NO2 concentration in Lahore city, where exposure data are lacking, for three periods: pre-monsoon (April, May and June), monsoon (July and August) and post-monsoon (September, October and November). LUR was applied to understand how different pollution variables affect the pollutant concentration in each season and to depict its spatial distribution; the results can also support long-term epidemiological studies of air pollution in the future. Comparing the three seasonal models is necessary for a better understanding of local emission sources and of the effect of the potential predictor variables in each season.
4. Discussion
Land use regression models have been applied in developed countries, but their application in developing countries is still lacking [37,44]. To the best of our knowledge, this is the first attempt to apply an LUR model at the city scale in an urban setting in Pakistan, although LUR has been used at the national level in Pakistan for ambient PM2.5 exposure [29]. LUR models were developed for the seasonal (pre-monsoon, monsoon and post-monsoon) mean concentrations of NO2, based on data collected at 18 monitoring locations in Lahore city, Pakistan. The final LUR models performed well, showing reliability with high accuracy and spatial heterogeneity.
Previous LUR models in the literature have shown R² values ranging from 0.41 in the startup model to 0.73 in the final model, with an R² of 0.68 for the winter model and 0.59 for the summer model [45]. A study conducted in Xian, China reported an R² greater than 0.8, indicating that the heating season had the best simulation effect [46]. A study conducted in Nanjing, China reported an R² of 0.7 for the NO2 model [39]. The LUR models developed in this study have R² values of 0.5–0.61, similar to those of other studies in the literature, indicating that it is feasible to develop such models for developing countries and to use them in exposure assessment studies. The model R² values were close to the leave-one-out cross-validation (LOOCV) R² values, showing the robustness of the LUR models for all seasons [44].
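The comparison between model R² and LOOCV R² can be illustrated with a minimal sketch. For brevity it assumes a single predictor and a simple linear fit, whereas an LUR model uses several predictors; the data and function names are hypothetical, and R² is computed here as one minus the ratio of residual to total sum of squares, which is only one of the definitions used in the literature.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(observed, predicted):
    """Compute 1 - SS_res / SS_tot for paired observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def model_r_squared(xs, ys):
    """R² of the model fitted to, and evaluated on, all sites."""
    intercept, slope = fit_line(xs, ys)
    return r_squared(ys, [intercept + slope * x for x in xs])


def loocv_r_squared(xs, ys):
    """R² from predictions made for each site by a model fitted without it."""
    held_out_predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        held_out_predictions.append(intercept + slope * xs[i])
    return r_squared(ys, held_out_predictions)


# Hypothetical predictor values and NO2 concentrations at five sites.
road_length = [1.0, 2.0, 3.0, 4.0, 5.0]
no2 = [3.0, 5.5, 6.5, 9.5, 10.5]

fit_r2 = model_r_squared(road_length, no2)
cv_r2 = loocv_r_squared(road_length, no2)
```

A LOOCV R² that stays close to the model R² suggests that no single monitoring site drives the fit, which is the sense of robustness referred to above.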
Data collected by manually surveying the study area helped to improve the performance of the LUR models. This type of survey can capture specific features of potential predictor variables in the study area and thereby improve the overall fit of the LUR model. In this study, data on vehicle maintenance workshops, which can be a source of NO2, were collected through a manual survey of the study area, highlighting the importance of culture- or site-specific land use classes [47,48]. Distance to vehicle maintenance workshop entered the final models for all seasons and was found to be one of the influencing factors for source contribution to NO2.
The potential predictor variables that entered the final models showed that different factors influence the pollutant concentrations, although the same predictor variables were used to develop the LUR models. The road network (road length) was an influencing factor: NO2, known as a traffic-related air pollutant, showed a positive association with road length in the final LUR models for all seasons. Residential area appears to be another influential factor, showing a positive association with NO2 concentration. The reason is that households use fuel-based generators for electricity and combustion processes, and that people travel frequently around residential areas, which increases the NO2 concentration through vehicle exhaust emissions [49]. The distance variable also contributed to all the models, highlighting the importance of local automobile workshops in the emission of NO2. Hospital area reflected the contribution of NO2 from generator facilities used in the vicinity of hospitals and from local automobile workshops.
Electricity generation and demand are a serious issue in developing countries, especially in Pakistan, which leads to the use of fuel-based generators to produce electricity in households and hospitals, increasing the pollutant concentration. Residential areas also contributed as a source of NO2 pollution through the burning of fossil fuels in households, especially during the heating season (post-monsoon), which had the highest concentration of all the seasons. The specific local data survey (SLDS) helped to identify another factor, vehicle maintenance workshops located along roads and around residential areas, which raise the pollutant concentration through vehicle maintenance activities.
The significant predictor variables identified in this study were similar to those of previous studies, which show road length and residential land area as common influencing factors [50,51]. A study conducted in Taipei city, Taiwan also attributed high concentrations to road length, as one of the influencing factors, and to the dense road network [52]. Another study likewise found that major roads and traffic were among the influencing factors included in the models for heating and non-heating seasons [53]. The study conducted in Nanjing, China indicated that residential area within 100 m and 5000 m buffers entered the final model and proved to be a significant predictor variable [39]. The models developed in this study, based on the influencing predictor variables, are comparable with those of previous studies, supporting the use of LUR models in the urban setting of Lahore, Pakistan.
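Buffer and distance predictors of the kind discussed above can be sketched as follows. This is a simplified illustration that assumes planar coordinates in metres and treats each land parcel as a point at its centroid; a GIS would instead intersect parcel polygons with the buffer. The function names, coordinates and areas are hypothetical.

```python
import math


def distance_to_nearest(site, features):
    """Return the distance (m) from a monitoring site to the closest feature.

    `features` is a list of (x, y) points, for example surveyed workshops.
    """
    return min(math.hypot(site[0] - fx, site[1] - fy) for fx, fy in features)


def area_within_buffer(site, parcels, radius_m):
    """Sum the area of parcels whose centroid lies within `radius_m` of the site.

    `parcels` is a list of (centroid_x, centroid_y, area_m2) tuples.
    """
    total_area = 0.0
    for cx, cy, area in parcels:
        if math.hypot(site[0] - cx, site[1] - cy) <= radius_m:
            total_area += area
    return total_area


# Hypothetical monitoring site, workshops and residential parcels.
monitoring_site = (0.0, 0.0)
workshops = [(300.0, 400.0), (1000.0, 0.0)]
residential_parcels = [
    (50.0, 0.0, 2000.0),
    (0.0, 3000.0, 5000.0),
    (6000.0, 0.0, 1000.0),
]

workshop_distance = distance_to_nearest(monitoring_site, workshops)
residential_100m = area_within_buffer(monitoring_site, residential_parcels, 100.0)
residential_5000m = area_within_buffer(monitoring_site, residential_parcels, 5000.0)
```

Computing the same variable at several radii, as with the 100 m and 5000 m buffers, lets the regression separate very local sources from the wider urban background.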
Across the three seasons, the predicted concentration maps showed similar spatial characteristics, with high concentrations in the city center, where the residential land area is large because of the population. After residential area, the highest concentrations were observed around the road network, which is mainly distributed in the center and the western part of the city. Vehicle maintenance workshops are also located along roads and near residential and hospital areas, indicating that people living near those areas could suffer negative health effects from exposure to the pollutants. Since NO2 concentration is related to traffic intensity and residential area, this spatial distribution is reasonable.
This study had some limitations. The selected measurement sites may not capture the pollutant concentration distribution across the whole study area, because they were mostly located in the city area and therefore lack coverage of rural areas, which can be addressed in future studies. Although NO2 concentration is related to industrial emissions, industrial variables did not enter the final model because of the limited availability of industrial data; future work could use a specific local data survey to capture industrial sites.