This study aimed to develop a method for micro-population estimation at the building unit scale, with a particular focus on estimating building height and usage. Additionally, we sought to ensure the applicability and scalability of our framework across regions by emphasizing the integration of generally available data, preferably open data. The model was tested in Japan, where the reference data are available for validation. Our research has yielded several findings.
4.1. Evaluation from Building Use Classification
First, the machine learning approach to building use classification, built on a two-stage forecasting flow, proved strong at distinguishing detached from non-detached houses and at identifying apartment buildings. The decision-making process of the models was analyzed using SHAP and Grad-CAM, revealing the key features that each model prioritizes when classifying building use and verifying the complementarity between the polygon data-based and aerial image-based models.
SHAP analysis showed that the relative uniformity in size and shape is a characteristic feature influencing the identification of detached houses. Additionally, features extracted from POI data, such as the proximity to restaurants and the spatial co-location patterns of various POIs, play a significant role in identifying multi-family housing (apartment buildings). These POI features capture the semantic context surrounding buildings, reflecting urban development patterns, and real-world spatial relationships [65,66]. However, it is important to note that urban developments and zoning regulations can vary by region, which should be taken into account when adopting the approach. The classification of detached offices remains a challenge requiring further refinement.
Grad-CAM analysis reveals that misclassification of detached houses often occurs when the model fails to adequately capture roof characteristics, particularly the roofs of attached buildings such as warehouses and garages. Misclassification of apartment buildings often arises from difficulty in recognizing the contours of buildings with complex shapes. To mitigate this problem, we recommend using higher-resolution image data and improving the model architecture so that it properly captures complex building structures.
The results of both the SHAP and Grad-CAM analyses confirm that the two machine learning models, polygon data-based and aerial image-based, rely on different features for building use classification, providing insight into the strengths and limitations of each. The two-stage forecasting approach enhances the efficacy of the proposed ensemble method, resulting in a more comprehensive and accurate building use classification.
The utilization of POI data presents two significant challenges. Firstly, the distribution characteristics of POI data may vary considerably across regions. Even within Japan, the applicability of these characteristics is expected to differ between urban centers and suburban areas. Given that this study aims for international implementation, it is crucial to examine in detail how these characteristics vary across countries and regions. The second challenge pertains to the completeness of POI data. Developing countries, which we intend to investigate in future work, are likely to have more limited availability of such data than Japan. Consequently, there is a concern that the model may be applicable only to data-rich areas, thus restricting its generalizability.
To address these challenges, we propose two potential improvements. Firstly, the construction of a model using representative data extracted from diverse regions through appropriate sampling techniques could be considered. This approach would enable the creation of a more versatile model that is not dependent on any specific region. Secondly, the creation of distinct models for different areas or land use zones could be effective. By selecting and applying the most suitable model based on regional characteristics, it is anticipated that accurate classification can be performed across diverse geographical contexts.
Future research should focus on implementing these proposed improvements and quantitatively evaluating their efficacy. Furthermore, it is essential to explore alternative spatial data sources that could potentially substitute for POI data, as well as develop new features that can be applied in data-scarce regions. This approach will enhance the model’s adaptability to various urban environments globally, thereby increasing its practical applicability in diverse international settings.
4.2. Terrain Impact on Building Height Estimation Accuracy
The results of our building height estimation analysis reveal that estimation accuracy varies significantly across regions. Areas with relatively flat terrain, such as Tokyo, Nagoya, and Osaka, yield higher estimation accuracies of over 85%. In contrast, our preliminary analysis of the underperforming estimations shows that errors are more prevalent in terrain with approximately 10% or greater elevation variation.
Several factors may contribute to the increased error margins in mountainous regions. Firstly, the techniques used for height estimation may be inadequate for topographically complex areas with substantial variability in elevation, where the interplay between terrain and building heights increases uncertainty. Secondly, the data resolution may be insufficient to capture fine detail in rugged terrain: the DEM data used in this study have a resolution of 30 m, so terrain variations may not be adequately represented in the input data, leading to inaccurate building height estimates. Thirdly, model calibration is a significant factor. Although our estimation model includes a terrain correction, further adjustments or new frameworks tailored specifically to mountainous environments may be necessary to mitigate the impact of terrain complexity on building height estimation.
When utilizing the number of floors derived from estimated height, it is important to consider that both metrics represent similar building characteristics. However, exploring the margin of differences between building height and floor count could provide additional insights. Additionally, factors such as zoning should be considered when determining average floor height [67].
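The conversion from estimated height to floor count can be sketched as follows. This is a minimal illustration, not the procedure used in this study: the function name `estimate_floors` and the per-zone values in `ASSUMED_FLOOR_HEIGHT_M` are hypothetical placeholders, chosen only to show how a zoning-dependent average floor height would enter the calculation.

```python
# Hypothetical average floor heights in metres.
# Placeholder values for illustration only; they are not taken from this study.
ASSUMED_FLOOR_HEIGHT_M = {
    "residential": 3.0,
    "commercial": 4.0,
}
DEFAULT_FLOOR_HEIGHT_M = 3.0


def estimate_floors(height_m: float, zone: str) -> int:
    """Convert an estimated building height to a floor count of at least 1."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    # Fall back to the default when the zone has no assumed floor height.
    floor_height = ASSUMED_FLOOR_HEIGHT_M.get(zone, DEFAULT_FLOOR_HEIGHT_M)
    return max(1, round(height_m / floor_height))
```

With these placeholder values, the same 12 m building maps to four floors in a residential zone and three in a commercial zone, which illustrates why the choice of average floor height matters.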
4.3. Population Allocation
The population allocation methodology has performed well in this study, where the data are well maintained and commercially available [15]. However, our micro-population estimation model revealed several areas for improvement, which fall into three main trends: (1) overestimation in sparsely populated areas, (2) inaccuracies in areas with high concentrations of non-residential buildings, and (3) misclassification of industrial structures in port and harbor areas.
The first trend observed was overestimation in sparsely populated areas, such as mountainous regions. In this study, Hachioji City and Fukuoka City exhibited lower accuracy due to this issue. In such areas, even small differences in the number of residents can lead to significant errors due to the small population size. Additionally, since this method proportionally allocates population statistics by building units within a municipality, some areas may receive a disproportionately small allocation compared to the actual distribution. To address this, future research should explore methods that allow for population prorating on a more granular, underlying unit basis.
The second observed trend was lower accuracy in areas with a high concentration of non-residential buildings, particularly commercial facilities, such as downtown areas. In these regions, there was a tendency to misclassify non-residential buildings, especially commercial properties, as residential buildings. This misclassification led to an over-allocation of the population. To improve accuracy in these areas, it is necessary to refine the threshold for determining residential use, enhance the model, and develop methods for accurately classifying areas not designated for residential purposes.
Finally, the misclassification of warehouses and other facilities located in ports and harbors emerged as a significant issue. In Osaka and Nagoya, warehouses in port areas were misclassified and allocated population, resulting in decreased accuracy. This problem combines elements of the previous two issues: these areas lack residents and are not used for residential purposes. In such areas, it is essential first to develop methods to exclude these regions from population allocation and to improve the model accordingly.
These improvement measures are expected to lead to the development of a more accurate building-by-building population estimation method. Moving forward, addressing these challenges is essential for developing a versatile population estimation model that can be applied across a broader range of urban environments.
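The proportional allocation that underlies these three trends can be sketched as follows. This is a minimal illustration under stated assumptions: each building's weight is taken to be footprint area times floor count if it is classified as residential and zero otherwise, which is an assumed weighting that may differ from the one used in this study. The `Building` class and the `allocate_population` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Building:
    building_id: str
    footprint_m2: float
    floors: int
    is_residential: bool


def allocate_population(
    buildings: list[Building], municipal_population: float
) -> dict[str, float]:
    """Distribute a municipal total over buildings in proportion to assumed weights."""
    # Assumed weight: residential floor area (footprint x floors); 0 for non-residential.
    weights = {
        b.building_id: (b.footprint_m2 * b.floors if b.is_residential else 0.0)
        for b in buildings
    }
    total = sum(weights.values())
    if total == 0:
        # No residential floor area in the municipality: nothing can be allocated.
        return {bid: 0.0 for bid in weights}
    return {bid: municipal_population * w / total for bid, w in weights.items()}
```

Because the municipal total is fixed, a warehouse or store wrongly flagged as residential takes population away from true residences, and in sparsely populated areas a shift of a few residents becomes a large relative error.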
To further investigate these three trends, we conducted a comparative analysis of the relationship between error rates and land use zones by aligning Japanese land use zoning data with small-area boundaries. We utilized land use zoning data from the National Land Numerical Information database [68]. Table 8 defines each land use zone, while Figure 13 presents the average aggregated results of land use zones and error rates for each small area.
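The aggregation behind such a comparison can be sketched as follows. This is a minimal illustration, assuming that each small area has already been assigned a single land use zone and that the error rate is the absolute relative error between estimated and reference population. Both the assignment rule and the error definition are assumptions, and the function names are hypothetical.

```python
from collections import defaultdict


def relative_error(estimated: float, reference: float) -> float:
    """Absolute relative error; an assumed definition of the per-area error rate."""
    if reference <= 0:
        raise ValueError("reference population must be positive")
    return abs(estimated - reference) / reference


def mean_error_by_zone(
    areas: list[tuple[str, float, float]]
) -> dict[str, float]:
    """Average error rate per zone from (zone, estimated, reference) records."""
    errors = defaultdict(list)
    for zone, estimated, reference in areas:
        errors[zone].append(relative_error(estimated, reference))
    return {zone: sum(vals) / len(vals) for zone, vals in errors.items()}
```

Small areas that span several zones would need an explicit rule, such as assigning the zone with the largest overlapping area, before this aggregation applies.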
Notably, we observed differences in trends between Tokyo, where the model was developed, and other cities. In Tokyo, larger errors were more common in zones such as the Category I and Category II Mid-/High-Rise-Oriented Residential Zones, where hospitals, universities, and stores can coexist with residences. The higher error rates in these zones are likely due to misclassifications between hospitals or stores of a certain size and apartment buildings or detached office buildings; in addition, taller buildings tend to exhibit greater estimation errors.
In contrast, predictions for areas outside Tokyo frequently showed misclassifications in exclusive industrial zones. This likely occurred because the Tokyo-based model did not include sufficient examples from these zones, so that buildings there were classified as apartment buildings and received a proportional share of the population. This finding corroborates the earlier observation of misclassifications in port areas.
Furthermore, both inside and outside Tokyo, commercial zones exhibited a tendency for overestimation. This overestimation likely stems from the misclassification of commercial facilities as apartment buildings combined with the general tendency for taller structures to produce larger estimation errors.
To address these issues, one approach is to appropriately sample training data from each land use zone during model development. In this study, large errors occurred in industrial areas not represented in the training data. By ensuring that training data include a diverse range of land use zones, we can mitigate such biases. Another strategy is to cluster areas based on their characteristics and develop separate models for each cluster or region. Our analysis revealed distinct error patterns between Tokyo, where the model was trained, and other regions. By clustering or grouping similar areas and constructing models tailored to these clusters, we can potentially improve prediction accuracy.
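The first strategy, sampling training data from each land use zone, can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `stratified_sample`, the `zone` key, and the fixed per-zone quota are assumptions rather than details of this study.

```python
import random
from collections import defaultdict


def stratified_sample(records: list[dict], per_zone: int, seed: int = 0) -> list[dict]:
    """Draw up to `per_zone` records from every land use zone."""
    rng = random.Random(seed)  # seeded for reproducible draws
    by_zone = defaultdict(list)
    for rec in records:
        by_zone[rec["zone"]].append(rec)
    sample = []
    for zone in sorted(by_zone):
        group = by_zone[zone]
        # Take the whole group when a zone has fewer records than the quota.
        k = min(per_zone, len(group))
        sample.extend(rng.sample(group, k))
    return sample
```

A fixed quota keeps rare zones, such as exclusive industrial zones, from being crowded out by the far more numerous residential buildings; a quota proportional to zone size would be an alternative design.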
By adopting these methodologies and modeling approaches that account for regional characteristics, we aim to develop more effective population estimation techniques.
Our ultimate goal is to establish a robust methodology for generating reliable building-level demographic statistics in developing countries, which is essential for urban planning, resource allocation, and disaster management. Access to detailed data on both buildings and population enables us to conduct comprehensive analyses at the building unit level. An exemplar of this approach is illustrated in Figure 14, which represents a prototype for integrating population data at the building unit level with vulnerability factors in risk assessment [69]. This prototype classifies vulnerability levels for population distribution in each building, incorporating data from neighboring structures within a defined buffer zone.