Image Retrieval for Local Architectural Heritage Recommendation Based on Deep Hashing
Abstract
1. Introduction
- We construct a new dataset, CAH10, for traditional Chinese architectural heritage.
- We propose a deep-hashing-based retrieval method that achieves high recommendation accuracy. Analysis of the retrieved results reveals the relations among different architectural heritage categories.
- We adopt a data fine-tuning strategy to overcome the limited quantity of local architectural heritage data. This strategy also helps the retrieval model extract better image features.
- For a better user experience, we provide an image-to-location application as the basis for a recommendation system.
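The retrieval and image-to-location ideas listed above can be illustrated with a minimal sketch. It assumes that each image is represented by a K-bit binary code obtained from the sign of the network output and that results are ranked by Hamming distance, which is the usual setup in deep hashing; the `TargetImage` class, the image names, the 8-bit codes, and the coordinates are all hypothetical and are not taken from CAH10.

```python
from dataclasses import dataclass


@dataclass
class TargetImage:
    """One entry of a hypothetical target-set index."""
    name: str
    code: int   # K-bit hash code packed into an integer
    lat: float  # latitude in degrees (illustrative value)
    lon: float  # longitude in degrees (illustrative value)


def binarize(features):
    """Pack the signs of real-valued outputs into one integer code."""
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


def retrieve(query_code, index, top_n=10):
    """Return the top_n indexed images closest to the query in Hamming space."""
    ranked = sorted(index, key=lambda item: hamming(query_code, item.code))
    return ranked[:top_n]


# Toy 8-bit index; names, codes, and coordinates are made up for illustration.
index = [
    TargetImage("ancestral_hall_01", 0b10110010, 28.68, 115.89),
    TargetImage("bridge_03", 0b01001101, 27.11, 114.98),
    TargetImage("tower_07", 0b10110110, 29.71, 116.00),
]

# Hypothetical network output for a user's query image.
query = binarize([0.9, -0.2, 0.7, 0.8, -0.5, -0.1, 0.6, -0.3])
for hit in retrieve(query, index, top_n=2):
    print(hit.name, hamming(query, hit.code), (hit.lat, hit.lon))
```

Because every indexed image carries its coordinates, the ranked list can be handed directly to a map view; how the coordinates are presented to the user is left open here.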
2. Related Works
3. Data
- Source set: We use the source set to pretrain our retrieval model. It contains a random 90% split of an image collection. The images in the collection were found by searching with each heritage class name as the keyword and were then selected by a specialist according to the class definition. The selected images cover different regions and cultures; synthetic images are also included.
- Query set: The query set contains the remaining 10% of the image collection. These images are used to evaluate the model's retrieval accuracy on the source set or the target set. The set also stands in for possible user input, demonstrating what the model retrieves.
- Target set: This set contains the local architectural heritage images, which are the items our retrieval system recommends. We collected 285 images from Jiangxi, China. To support image-to-location recommendation, each image is tagged with its geographical coordinates. Note that the target-set images are excluded from both the source set and the query set.
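The three-way partition described above can be sketched as follows. This is only an illustration under stated assumptions: the file names are hypothetical, the random seed is arbitrary, and the helper `split_collection` is not part of the authors' code. It shows a seeded 90/10 random split of the collection and a check that the target set stays disjoint from both parts.

```python
import random


def split_collection(images, query_fraction=0.1, seed=0):
    """Randomly split a collection into (source, query) lists."""
    shuffled = list(images)
    random.Random(seed).shuffle(shuffled)
    n_query = round(len(shuffled) * query_fraction)
    return shuffled[n_query:], shuffled[:n_query]


# Hypothetical file names; the real collection is not reproduced here.
collection = [f"img_{i:04d}.jpg" for i in range(100)]
target = [f"jiangxi_{i:03d}.jpg" for i in range(5)]

source, query = split_collection(collection)

# The target set must not overlap with the source or query sets.
assert not set(target) & (set(source) | set(query))
print(len(source), len(query))  # 90 10
```

Fixing the seed keeps the split reproducible, so the same query images are used every time the model is evaluated.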
4. Methods
4.1. Problem Definition and Model Structure
4.2. Learning Similarity
4.2.1. Quantization Loss
4.2.2. Similarity Loss
4.2.3. Overall Loss
4.3. Data Fine-Tuning and Image Retrieval
5. Results
5.1. Implementation Details
5.2. Evaluation Metrics
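The result tables of this article report ACG@n and MAP@n. As a rough illustration, the sketch below uses one common definition of each for single-label retrieval: average cumulative gain as the mean relevance of the top-n results, and average precision averaged over the relevant hits within the top n. These definitions, the function names, and the toy relevance lists are assumptions; the exact formulas used by the authors may differ.

```python
def acg_at_n(relevance, n):
    """Average cumulative gain: summed relevance of the top-n results divided by n."""
    return sum(relevance[:n]) / n


def ap_at_n(relevance, n):
    """Average precision over the top-n results (0.0 if none are relevant)."""
    hits, precision_sum = 0, 0.0
    for rank, rel in enumerate(relevance[:n], start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_over_queries(metric, all_relevance, n):
    """Average a per-query metric over all queries."""
    return sum(metric(r, n) for r in all_relevance) / len(all_relevance)


# Toy ranked results for two queries: 1 = same class as the query, 0 = different.
ranked = [
    [1, 1, 0, 1],
    [0, 1, 0, 0],
]
print(mean_over_queries(acg_at_n, ranked, 4))  # 0.5
print(mean_over_queries(ap_at_n, ranked, 4))
```

With binary relevance, ACG@n coincides with precision at n, while MAP@n additionally rewards placing the relevant images near the top of the list.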
5.3. Retrieval Performance
5.4. Results Comparison with Different Model Settings
5.5. Top Retrieval Results
6. Discussion
6.1. Advantages of Data Fine-Tuning
6.2. Retrieval with Geographical Coordinates
6.3. Limitations and Future Works
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Marty, P.F. Digital convergence and the information profession in cultural heritage organizations: Reconciling internal and external demands. Libr. Trends 2014, 62, 613–627.
2. Yilmaz, H.M.; Yakar, M.; Gulec, S.A.; Dulgerler, O.N. Importance of digital close-range photogrammetry in documentation of cultural heritage. J. Cult. Herit. 2007, 8, 428–433.
3. Navarrete, T. Digital cultural heritage. In Handbook on the Economics of Cultural Heritage; Edward Elgar Publishing: Cheltenham, UK, 2013.
4. Calvanese, V.; Zambrano, A. A Conceptual Design Approach for Archaeological Structures, a Challenging Issue between Innovation and Conservation: A Studied Case in Ancient Pompeii. Buildings 2021, 11, 167.
5. Tejedor, B.; Lucchi, E.; Bienvenido-Huertas, D.; Nardi, I. Non-Destructive Techniques (NDT) for the diagnosis of heritage buildings: Traditional procedures and futures perspectives. Energy Build. 2022, 263, 112029.
6. Zou, Z.; Zhao, X.; Zhao, P.; Qi, F.; Wang, N. CNN-based statistics and location estimation of missing components in routine inspection of historic buildings. J. Cult. Herit. 2019, 38, 221–230.
7. Condorelli, F.; Rinaudo, F.; Salvadore, F.; Tagliaventi, S. A Neural Networks Approach to Detecting Lost Heritage in Historical Video. ISPRS Int. J. Geo-Inf. 2020, 9, 297.
8. Gumbarević, S.; Milovanović, B.; Gaši, M.; Bagarić, M. Application of Multilayer Perceptron Method on Heat Flow Meter Results for Reducing the Measurement Time. Eng. Proc. 2020, 2, 29.
9. Bienvenido-Huertas, D.; Rubio-Bellido, C.; Pérez-Ordóñez, J.L.; Moyano, J. Optimizing the evaluation of thermal transmittance with the thermometric method using multilayer perceptrons. Energy Build. 2019, 198, 395–411.
10. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9.
11. Yamashita, R.; Nishio, M.; Do, R.K.G.; Togashi, K. Convolutional neural networks: An overview and application in radiology. Insights Imaging 2018, 9, 611–629.
12. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
13. Zhou, Y.; Liang, Y.; Pan, Y.; Yuan, X.; Xie, Y.; Jia, W. A Deep-Learning-Based Meta-Modeling Workflow for Thermal Load Forecasting in Buildings: Method and a Case Study. Buildings 2022, 12, 177.
14. Kim, J.; Yum, S.; Son, S.; Son, K.; Bae, J. Modeling Deep Neural Networks to Learn Maintenance and Repair Costs of Educational Facilities. Buildings 2021, 11, 165.
15. Llamas, J.; M Lerones, P.; Medina, R.; Zalama, E.; Gómez-García-Bermejo, J. Classification of architectural heritage images using deep learning techniques. Appl. Sci. 2017, 7, 992.
16. Yoshimura, Y.; Cai, B.; Wang, Z.; Ratti, C. Deep learning architect: Classification for architectural design through the eye of artificial intelligence. In Computational Urban Planning and Management for Smart Cities. CUPUM 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 249–265.
17. Gupta, R.; Mukherjee, P.; Lall, B.; Gupta, V. Semantics Preserving Hierarchy based Retrieval of Indian heritage monuments. In Proceedings of the 2nd Workshop on Structuring and Understanding of Multimedia Heritage Contents, Seattle, WA, USA, 12–16 October 2020; pp. 5–13.
18. Sipiran, I.; Lazo, P.; Lopez, C.; Jimenez, M.; Bagewadi, N.; Bustos, B.; Dao, H.; Gangisetty, S.; Hanik, M.; Ho-Thi, N.P.; et al. SHREC 2021: Retrieval of cultural heritage objects. Comput. Graph. 2021, 100, 1–20.
19. Oyedare, T.; Park, J.M.J. Estimating the required training dataset size for transmitter classification using deep learning. In Proceedings of the 2019 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), Newark, NJ, USA, 11–14 November 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–10.
20. Yosinski, J.; Clune, J.; Bengio, Y.; Lipson, H. How transferable are features in deep neural networks? arXiv 2014, arXiv:1411.1792.
21. McInnes, L.; Healy, J.; Melville, J. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv 2018, arXiv:1802.03426.
22. Chen, W.; Liu, Y.; Wang, W.; Bakker, E.M.; Georgiou, T.; Fieguth, P.W.; Liu, L.; Lew, M.S. Deep Image Retrieval: A Survey. arXiv 2021, arXiv:2101.11282.
23. Gionis, A.; Indyk, P.; Motwani, R. Similarity search in high dimensions via hashing. In Proceedings of the 25th VLDB Conference, Edinburgh, UK, 7–10 September 1999; Volume 99, pp. 518–529.
24. Raginsky, M.; Lazebnik, S. Locality-sensitive binary codes from shift-invariant kernels. Adv. Neural Inf. Process. Syst. 2009, 22, 1509–1517.
25. Zhu, H.; Long, M.; Wang, J.; Cao, Y. Deep hashing network for efficient similarity retrieval. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016.
26. Cao, Z.; Long, M.; Wang, J.; Yu, P.S. Hashnet: Deep learning to hash by continuation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 5608–5617.
27. Cao, Y.; Long, M.; Liu, B.; Wang, J. Deep cauchy hashing for hamming space retrieval. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1229–1237.
28. Zhang, Z.; Zou, Q.; Lin, Y.; Chen, L.; Wang, S. Improved deep hashing with soft pairwise similarity for multi-label image retrieval. IEEE Trans. Multimed. 2019, 22, 540–553.
29. Yuan, L.; Wang, T.; Zhang, X.; Tay, F.E.; Jie, Z.; Liu, W.; Feng, J. Central similarity quantization for efficient image and video retrieval. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 3083–3092.
30. Xia, R.; Pan, Y.; Lai, H.; Liu, C.; Yan, S. Supervised hashing for image retrieval via image representation learning. In Proceedings of the AAAI, Québec City, QC, Canada, 27–31 July 2014.
31. Belhi, A.; Bouras, A. CNN Features vs. Classical Features for Largescale Cultural Image Retrieval. In Proceedings of the 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies (ICIoT), Doha, Qatar, 2–5 February 2020; pp. 95–99.
32. Liu, E. Research on image recognition of intangible cultural heritage based on CNN and wireless network. EURASIP J. Wirel. Commun. Netw. 2020, 2020, 1–12.
33. Wang, B.; Li, L.; Nakashima, Y.; Yamamoto, T.; Ohshima, H.; Shoji, Y.; Aihara, K.; Kando, N. Image Retrieval by Hierarchy-aware Deep Hashing Based on Multi-task Learning. In Proceedings of the 2021 International Conference on Multimedia Retrieval, Taipei, Taiwan, 21–24 August 2021.
34. Cao, Y.; Long, M.; Wang, J.; Zhu, H.; Wen, Q. Deep quantization network for efficient image retrieval. In Proceedings of the Thirtieth AAAI Conference, Phoenix, AZ, USA, 12–17 February 2016.
35. Zhang, J.; Fukuda, T.; Yabuki, N. Development of a City-Scale Approach for Façade Color Measurement with Building Functional Classification Using Deep Learning and Street View Images. ISPRS Int. J. Geo-Inf. 2021, 10, 551.
36. Mikołajczyk, A.; Grochowski, M. Data augmentation for improving deep learning in image classification problem. In Proceedings of the 2018 International Interdisciplinary PhD Workshop (IIPhDW), Swinoujscie, Poland, 9–12 May 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 117–122.
37. Shorten, C.; Khoshgoftaar, T.M. A survey on image data augmentation for deep learning. J. Big Data 2019, 6, 60.
38. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
39. Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Li, F.-F. Imagenet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 248–255.
40. Zhang, Y.; Ling, C. A strategy to apply machine learning to small datasets in materials science. NPJ Comput. Mater. 2018, 4, 25.
41. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
42. Järvelin, K.; Kekäläinen, J. IR Evaluation Methods for Retrieving Highly Relevant Documents. Available online: https://dl.acm.org/doi/abs/10.1145/3130348.3130374 (accessed on 9 May 2022).
43. Baeza-Yates, R.; Ribeiro-Neto, B. Modern Information Retrieval; ACM Press: New York, NY, USA, 1999; Volume 463.
44. Weiss, Y.; Torralba, A.; Fergus, R. Spectral hashing. Adv. Neural Inf. Process. Syst. 2008, 21, 1753–1760.
45. Kulis, B.; Darrell, T. Learning to hash with binary reconstructive embeddings. Adv. Neural Inf. Process. Syst. 2009, 22, 1042–1050.
46. Shen, F.; Shen, C.; Liu, W.; Tao Shen, H. Supervised discrete hashing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 37–45.
47. Gong, Y.; Lazebnik, S.; Gordo, A.; Perronnin, F. Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 35, 2916–2929.
48. Lai, H.; Pan, Y.; Liu, Y.; Yan, S. Simultaneous feature learning and hash coding with deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3270–3278.
49. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
50. Tan, M.; Le, Q. Efficientnet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning. PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 6105–6114.
51. Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A.A. Inception-v4, inception-resnet and the impact of residual connections on learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017.
Class | Source Set | Target Set | Query Set | Total |
---|---|---|---|---|
Bridge | 267 | 12 | 28 | 307 |
Ancestral hall | 299 | 40 | 26 | 365 |
Palace | 151 | 25 | 18 | 194 |
Residential building | 463 | 73 | 51 | 587 |
Pavilion | 208 | 13 | 19 | 240 |
Memorial archway | 281 | 31 | 20 | 332 |
Temple | 221 | 33 | 17 | 271 |
Theatre | 160 | 14 | 15 | 189 |
Modern historic | 256 | 12 | 28 | 296 |
Tower | 246 | 32 | 21 | 299 |
Total | 2552 | 285 | 243 | 3080 |
Method | ACG@50 (16-bit) | ACG@50 (32-bit) | ACG@50 (48-bit) | ACG@50 (64-bit) | MAP@50 (16-bit) | MAP@50 (32-bit) | MAP@50 (48-bit) | MAP@50 (64-bit) |
---|---|---|---|---|---|---|---|---|
SH [44] | 0.421 | 0.514 | 0.585 | 0.602 | 0.447 | 0.539 | 0.605 | 0.621 |
LSH [23] | 0.308 | 0.429 | 0.450 | 0.483 | 0.325 | 0.444 | 0.487 | 0.512 |
BRE [45] | 0.332 | 0.395 | 0.522 | 0.551 | 0.356 | 0.422 | 0.545 | 0.562 |
SDH [46] | 0.411 | 0.452 | 0.555 | 0.512 | 0.432 | 0.464 | 0.572 | 0.530 |
ITQ-CCA [47] | 0.500 | 0.543 | 0.585 | 0.561 | 0.519 | 0.560 | 0.601 | 0.573 |
CNNH [30] | 0.618 | 0.656 | 0.701 | 0.715 | 0.632 | 0.673 | 0.714 | 0.732 |
DNNH [48] | 0.593 | 0.677 | 0.717 | 0.725 | 0.611 | 0.692 | 0.733 | 0.735 |
DHN [25] | 0.580 | 0.674 | 0.688 | 0.705 | 0.595 | 0.693 | 0.702 | 0.726 |
HashNet [26] | 0.720 | 0.775 | 0.762 | 0.745 | 0.728 | 0.795 | 0.770 | 0.742 |
Ours | 0.731 | 0.792 | 0.771 | 0.752 | 0.738 | 0.812 | 0.788 | 0.765 |
Backbone | ACG@10 (16-bit) | ACG@10 (32-bit) | ACG@10 (48-bit) | ACG@10 (64-bit) | MAP@10 (16-bit) | MAP@10 (32-bit) | MAP@10 (48-bit) | MAP@10 (64-bit) |
---|---|---|---|---|---|---|---|---|
ResNet-18 | 0.753 | 0.788 | 0.755 | 0.724 | 0.767 | 0.801 | 0.770 | 0.736 |
ResNet-50 | 0.811 | 0.837 | 0.822 | 0.801 | 0.816 | 0.847 | 0.838 | 0.795 |
ResNet-101 | 0.792 | 0.803 | 0.788 | 0.770 | 0.809 | 0.813 | 0.799 | 0.783 |
DenseNet-121 | 0.712 | 0.749 | 0.740 | 0.710 | 0.723 | 0.765 | 0.758 | 0.721 |
EfficientNet-b4 | 0.795 | 0.821 | 0.791 | 0.758 | 0.808 | 0.833 | 0.812 | 0.787 |
Inception-V4 | 0.704 | 0.741 | 0.733 | 0.702 | 0.711 | 0.758 | 0.741 | 0.713 |
Setting | ACG@10 (16-bit) | ACG@10 (32-bit) | ACG@10 (48-bit) | ACG@10 (64-bit) | MAP@10 (16-bit) | MAP@10 (32-bit) | MAP@10 (48-bit) | MAP@10 (64-bit) |
---|---|---|---|---|---|---|---|---|
0 | 0.771 | 0.804 | 0.795 | 0.772 | 0.787 | 0.809 | 0.805 | 0.781 |
0.01 | 0.808 | 0.829 | 0.820 | 0.783 | 0.795 | 0.814 | 0.817 | 0.788 |
0.1 | 0.811 | 0.837 | 0.822 | 0.801 | 0.816 | 0.847 | 0.838 | 0.795 |
1.0 | 0.800 | 0.824 | 0.817 | 0.785 | 0.801 | 0.821 | 0.810 | 0.780 |
10.0 | 0.721 | 0.755 | 0.741 | 0.718 | 0.733 | 0.763 | 0.748 | 0.711 |
n | ACG@n | MAP@n |
---|---|---|
2 | 0.807 | 0.821 |
4 | 0.816 | 0.825 |
6 | 0.824 | 0.833 |
8 | 0.830 | 0.838 |
10 | 0.832 | 0.847 |
K bits | ACG@10 (Sole) | ACG@10 (w/o) | ACG@10 (w/) | MAP@10 (Sole) | MAP@10 (w/o) | MAP@10 (w/) |
---|---|---|---|---|---|---|
16 | 0.498 | 0.765 | 0.802 | 0.512 | 0.772 | 0.816 |
32 | 0.541 | 0.800 | 0.834 | 0.554 | 0.812 | 0.847 |
48 | 0.539 | 0.771 | 0.825 | 0.550 | 0.788 | 0.838 |
64 | 0.533 | 0.710 | 0.780 | 0.545 | 0.735 | 0.795 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ma, K.; Wang, B.; Li, Y.; Zhang, J. Image Retrieval for Local Architectural Heritage Recommendation Based on Deep Hashing. Buildings 2022, 12, 809. https://doi.org/10.3390/buildings12060809