Intuitively Searching for the Rare Colors from Digital Artwork Collections by Text Description: A Case Demonstration of Japanese Ukiyo-e Print Retrieval
Abstract
1. Introduction
1.1. Retrieval of Artwork
1.2. Human Senses and Colors
1.3. Contributions of Our Work
- A new retrieval framework is proposed for the word-based color retrieval of artwork. The framework utilizes the cross-modal multi-task fine-tuning method on CLIP.
- We propose a new artwork color descriptor and project the color information into the text feature space, so that similar colors can be found, in line with human senses, within a textual semantic space.
- We adapt IDF (inverse document frequency), a commonly used weighting method, to extract image color information, and we propose a label generation method for finding the rarest colors.
- A training data sampling method using a sketch structure is proposed, where images with the same structure can learn more similar feature vector representations.
- We apply the proposed method to retrieve ukiyo-e prints with two modes of color selection: direct main-color retrieval using a color selection board, and rare-color retrieval using text descriptions. By modifying the training setting, the main color can also be retrieved via a language description.
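As a rough illustration of the IDF-based rare-color idea in the contributions above, the sketch below scores each extracted color by its inverse document frequency over the whole collection, so that colors appearing in few images rank as "rare". The function and data names are hypothetical, and the paper's actual label generation may differ in detail:

```python
from collections import Counter
import math

def rarest_colors(doc_colors, top_k=3):
    """Rank each image's colors by inverse document frequency.

    `doc_colors` maps image id -> set of extracted color names.
    Colors that occur in few images get a high IDF and are
    treated as 'rare' for that image.
    """
    n_docs = len(doc_colors)
    df = Counter()
    for colors in doc_colors.values():
        df.update(set(colors))                      # document frequency per color
    idf = {c: math.log(n_docs / df[c]) for c in df} # rare colors -> high IDF
    return {
        img: sorted(colors, key=lambda c: idf[c], reverse=True)[:top_k]
        for img, colors in doc_colors.items()
    }

# Toy collection: 'dayflower blue' appears in only one print, so it is rarest.
docs = {
    "print_a": {"indigo", "beni red", "dayflower blue"},
    "print_b": {"indigo", "beni red"},
    "print_c": {"indigo"},
}
ranked = rarest_colors(docs, top_k=1)
```

With this toy input, `ranked["print_a"]` is `["dayflower blue"]`, since that color occurs in only one of the three images.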
2. Related Work
2.1. Artwork Retrieval Related Work
2.2. Related Deep Learning Models
3. Methodology
3.1. The Space Sampler Module
3.1.1. Image Sketch Extraction
3.1.2. Histogram of Oriented Gradients (HOG) Feature Extraction
3.1.3. Triplet Data Sampling
3.2. Text and Label Generator
3.2.1. Color Information Extraction
3.2.2. IDF Calculation
3.3. Multi-Task Fine-Tuning on the CLIP Model
3.3.1. Cross-Modal Fine-Tuning with Cosine Similarity-Based Pairwise Loss
3.3.2. The Fine-Tuning Image Encoder with Triplet Loss
4. Experiments and Results
4.1. Datasets and Basic Experimental Setup
1. the CN dataset and corresponding RGB color cards, where the color cards were extracted according to their HEX codes and used as input for image feature extraction; and
2. the 100 most recent color descriptions obtained from colornames.org [34], which provides a color naming interface. We also collected the color cards corresponding to these descriptions but used them only for visualizing text embeddings. The collected data samples for evaluation are shown in Table 5.
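A color card like those paired with the CN dataset can be rendered directly from its HEX code. The stdlib-only sketch below builds the flat RGB patch as a nested list; the function names are ours, and the paper's actual preprocessing (patch size, image format) may differ:

```python
def hex_to_rgb(hex_code):
    """Parse a HEX color code such as '#b0bf1a' into an (R, G, B) tuple."""
    h = hex_code.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

def color_card(hex_code, size=(224, 224)):
    """A solid color card as rows of RGB pixels, the kind of flat
    patch fed to an image encoder (plain arrays, no imaging library)."""
    rgb = hex_to_rgb(hex_code)
    w, h = size
    return [[rgb] * w for _ in range(h)]

card = color_card("#b0bf1a")  # 'Acid green' from the CN dataset
```

In practice such an array would be saved or converted with an imaging library (e.g. Pillow) before being passed to the image encoder.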
4.2. Training Process
4.3. Evaluation Experiments Using Representations Extracted from the Fine-Tuned Text Encoder
4.3.1. Pearson Correlation Coefficient Calculation on the HCN and CN Datasets
4.3.2. Example of Color-Based Similar Description Retrieval by Using Features Extracted from Pre-Trained Language Models
4.4. Evaluation Experiments Using Representations Extracted from a Fine-Tuned Image Encoder
4.4.1. Pearson Correlation Coefficient Calculation on the CN Dataset and the Color Card Image Dataset
4.4.2. Similarity Score Calculation on the Same Ukiyo-e Print Digitized at the Different Institutions
4.4.3. Example of Comparison on Image-to-Image Retrieval
4.4.4. Example of Rare Color Retrieval
4.5. Demo Application Implementation
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Serra, J.; Garcia, Á.; Torres, A.; Llopis, J. Color composition features in modern architecture. Color Res. Appl. 2012, 37, 126–133. [Google Scholar] [CrossRef]
- Mojsilovic, A. A computational model for color naming and describing color composition of images. IEEE Trans. Image Process. 2005, 14, 690–699. [Google Scholar] [CrossRef] [PubMed]
- Cotte, M.; Susini, J.; Metrich, N.; Moscato, A.; Gratziu, C.; Bertagnini, A.; Pagano, M. Blackening of Pompeian cinnabar paintings: X-ray microspectroscopy analysis. Anal. Chem. 2006, 78, 7484–7492. [Google Scholar] [CrossRef] [PubMed]
- Stepanova, E. The impact of color palettes on the prices of paintings. Empir. Econ. 2019, 56, 755–773. [Google Scholar] [CrossRef] [Green Version]
- He, X.F.; Lv, X.G. From the color composition to the color psychology: Soft drink packaging in warm colors and spirits packaging in dark colors. Color Res. Appl. 2022, 47, 758–770. [Google Scholar] [CrossRef]
- Sasaki, S.; Webber, P. A study of dayflower blue used in ukiyo-e prints. Stud. Conserv. 2002, 47, 185–188. [Google Scholar] [CrossRef]
- Demo Application Implementation of Color based Ukiyo-e Print Retrieval. Available online: http://color2ukiyoe.net/ (accessed on 4 July 2022).
- Art Research Center, Ritsumeikan University. 2020. ARC Ukiyo-e Database, Informatics Research Data Repository, National Institute of Informatics. Available online: https://doi.org/10.32130/rdata.2.1 (accessed on 4 July 2022).
- Yelizaveta, M.; Tat-Seng, C.; Irina, A. Analysis and retrieval of paintings using artistic color concepts. In Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, Amsterdam, The Netherlands, 6 June 2005; IEEE: Piscataway, NJ, USA, 2005. [Google Scholar]
- Smith, J.R.; Chang, S.-F. Tools and techniques for color image retrieval. In Storage and Retrieval for Still Image and Video Databases; International Society for Optics and Photonics: Bellingham, WA, USA, 1996; Chapter 4; Volume 2670. [Google Scholar]
- Collomosse, J.; Bui, T.; Wilber, M.J.; Fang, C.; Jin, H. Sketching with style: Visual search with sketches and aesthetic context. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017. [Google Scholar]
- Handpicked Color Names. Available online: https://github.com/meodai/color-names (accessed on 31 May 2022).
- Ranatunga, D.; Gadoci, B. Color-Names. Available online: https://data.world/dilumr/color-names (accessed on 31 May 2022).
- Newall, M. Painting with impossible colours: Some thoughts and observations on yellowish blue. Perception 2021, 50, 129–139. [Google Scholar] [CrossRef] [PubMed]
- Imgonline. Available online: https://www.imgonline.com.ua/eng/ (accessed on 6 June 2022).
- DeepAI: Image-Similarity Calculator. Available online: https://deepai.org/machine-learning-model/image-similarity (accessed on 31 May 2022).
- Goodall, S.; Lewis, P.H.; Martinez, K.; Sinclair, P.A.S.; Giorgini, F.; Addis, M.J.; Boniface, M.J.; Lahanier, C.; Stevenson, J. SCULPTEUR: Multimedia retrieval for museums. In Proceedings of the International Conference on Image and Video Retrieval, Singapore, 21–23 July 2004; Springer: Berlin/Heidelberg, Germany, 2004; pp. 638–646. [Google Scholar]
- Sharma, M.K.; Siddiqui, T.J. An ontology based framework for retrieval of museum artifacts. In Proceedings of the 7th International Conference on Intelligent Human Computer Interaction, Pilani, India, 12–13 December 2016; Elsevier: Amsterdam, The Netherlands, 2016; pp. 176–196. [Google Scholar]
- Falomir, Z.; Museros, L.; Sanz, I.; Gonzalez-Abril, L. Categorizing paintings in art styles based on qualitative color descriptors, quantitative global features and machine learning (QArt-Learn). Expert Syst. Appl. 2018, 97, 83–94. [Google Scholar] [CrossRef]
- Kim, N.; Choi, Y.; Hwang, S.; Kweon, I.S. Artrieval: Painting retrieval without expert knowledge. In Proceedings of the 2015 IEEE International Conference on Image Processing (ICIP), Quebec City, QC, Canada, 27–30 September 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1339–1343. [Google Scholar]
- Companioni-Brito, C.; Mariano-Calibjo, Z.; Elawady, M.; Yildirim, S. Mobile-based painting photo retrieval using combined features. In Proceedings of the International Conference Image Analysis and Recognition, Waterloo, ON, Canada, 27–29 August 2018; Springer: Cham, Switzerland, 2018; pp. 278–284. [Google Scholar]
- Lee, H.Y.; Lee, H.K.; Ha, Y.H. Spatial color descriptor for image retrieval and video segmentation. IEEE Trans. Multimed. 2003, 5, 358–367. [Google Scholar]
- Zhao, W.; Zhou, D.; Qiu, X.; Jiang, W. Compare the performance of the models in art classification. PLoS ONE 2021, 16, e0248414. [Google Scholar] [CrossRef] [PubMed]
- Wang, F.; Lin, S.; Luo, X.; Zhao, B.; Wang, R. Query-by-sketch image retrieval using homogeneous painting style characterization. J. Electron. Imaging 2019, 28, 023037. [Google Scholar] [CrossRef]
- Radford, A.; Kim, J.W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J. Learning transferable visual models from natural language supervision. In Proceedings of the International Conference on Machine Learning (PMLR), Virtual Event, 13–14 August 2021. [Google Scholar]
- Conde, M.V.; Turgutlu, K. CLIP-Art: Contrastive pre-training for fine-grained art classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Nashville, TN, USA, 20–25 June 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 3956–3960. [Google Scholar]
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
- Abdou, M.; Kulmizev, A.; Hershcovich, D.; Frank, S.; Pavlick, E.; Søgaard, A. Can language models encode perceptual structure without grounding? a case study in color. arXiv 2021, arXiv:2109.06129. [Google Scholar]
- Xiang, X.; Liu, D.; Yang, X.; Zhu, Y.; Shen, X.; Allebach, J.P. Adversarial open domain adaptation for sketch-to-photo synthesis. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 4–8 January 2022; IEEE: Piscataway, NJ, USA, 2021; pp. 1434–1444. [Google Scholar]
- Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; pp. 886–893. [Google Scholar]
- Darosh. Colorgram. Available online: https://github.com/obskyr/colorgram.py (accessed on 31 May 2022).
- Hickey, G. The Ukiyo-e Blues: An Analysis of the Influence of Prussian Blue on Ukiyo-e in the 1830s. Master’s Thesis, The University of Melbourne, Melbourne, Australia, 1994. [Google Scholar]
- Hoffer, E.; Ailon, N. Deep metric learning using triplet network. In Proceedings of the International Workshop on Similarity-Based Pattern Recognition, Copenhagen, Denmark, 12–14 October 2015; Springer: Cham, Switzerland, 2015; pp. 84–92. [Google Scholar]
- Colornames.org. Available online: https://colornames.org/download/ (accessed on 31 May 2022).
- Kingma, D.P.; Ba, J.S. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
- Chen, C.F.R.; Fan, Q.; Panda, R. Crossvit: Cross-attention multi-scale vision transformer for image classification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27–28 October 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 357–366. [Google Scholar]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- Kubo, S. Butterflies around a wine jar. 1818. Surimono, 203 × 277 cm, V&A Collection E136-1898. Photo: Courtesy of the Board of Trustees of Victoria & Albert Museum.
- Achlioptas, P.; Maks, O.; Haydarov, K.; Elhoseiny, M.; Guibas, L. Artemis: Affective language for visual art. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 11569–11579. [Google Scholar]
| Image–Label Pair | Label: Anchor | Label: Positive | Label: Negative |
|---|---|---|---|
|  | Totally Different | Overlap or Differ in Alphabetic Case | The Same |
| Proportion | 721 (48%) | 570 (37%) | 223 (15%) |
| For Training and Testing: HCN | For Evaluation: CN | Sample |
|---|---|---|
| Manganese Red | Amaranth |  |
| Hiroshima Aquamarine | Aquamarine |  |
| Wet Ash | Ash Gray |  |
| Dataset | Text Description | Color Card |
|---|---|---|
| CN dataset and corresponding RGB color cards | Acid green (#b0bf1a) |  |
| Color description from colornames.org | Slurp Blue (#38e0dd), Pyonked (#c61fab) | (–) |
| Pre-Trained Model | Pearson Correlation Coefficient Score |
|---|---|
| CLIP (ViT-B/32) | 0.7990 |
| Fine-tuned CLIP (ours) | 0.7996 ↑ |
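The scores above are Pearson correlation coefficients, presumably between similarities measured in the embedding space and similarities measured in color space. For reference, the coefficient between two equal-length vectors can be computed as follows (a minimal stdlib sketch; in practice `scipy.stats.pearsonr` would be used):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)  # in [-1, 1]; 1 = perfect positive correlation
```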
| HEX Code | Name | HEX Code | Name |
|---|---|---|---|
| 01ba8e | Technology Turquoise | dc13bb | Plurbin |
| 04196b | Bottle Cap Blue Juice | df1c8a | Punk Delilah |
| 0f1612 | Haha Its Not Black | e255f2 | Aortal |
| 11af1e | Its Definitely Not Pink | e5149a | Elle Woods Was Here |
| 16d54d | Grenitha | e5caeb | Light Salvia |
| 173838 | Fredoro | e8c251 | Luigi Yellow |
| 174855 | Business Blue | e98425 | True October |
| 176b76 | Dweebith | ea4835 | Piper Pizza Red |
| 20654d | Pompoono | eca741 | Yeach |
| 2388fb | Dragonbreaker147 | ef4bc3 | Violent Barbie Doll |
| 243a47 | Autumn Storm in the Mountains | f22966 | Vadelma |
| 260b84 | Gibbet | f5221b | Old Liquorish |
| 2d31c3 | Creeping Dusk | f57c7e | White Girl Sunburn Rash |
| 341e50 | Earthy Gearthy Purthy | fc7420 | Dinner in the Desert |
| Pre-Trained Language Model | Query | Rank 1 | Rank 2 | Rank 3 | Rank 4 | Rank 5 |
|---|---|---|---|---|---|---|
| BERT-base-uncased | Electrica Violet | Olayan | Fredoro | Polar Purple | Luigi Yellow | Toxic Mermaid Tears |
| BERT-base-multilingual-uncased |  | Red Purple | Luigi Yellow | Polar Purple | Vadelma | Light Salvia |
| Fine-tuned CLIP (ours) |  | Polar purple | Bright Light Eggplant Purple | Lavenviolet Crush | Could You Be Any More Purple | Oompa Lompie Purple |
| Pre-Trained Model | Pearson Correlation Coefficient Score |
|---|---|
| CLIP (ViT-B/32) | 0.2640 |
| Fine-tuned CLIP (ours) | 0.2642 ↑ |
| Input Pair | IMGonline [15] | DeepAI [16] |
|---|---|---|
| Input image (1,2) | 81.80% | 6 |
| Input image (2,3) | 90.05% | 2 |
| Input image (1,3) | 87.11% | 6 |
| Model | Image 1–Image 2 | Image 2–Image 3 | Image 1–Image 3 | Image 3–Ukiyo-e Print with Similar Structure but Different Colors |
|---|---|---|---|---|
| Fine-tuned CLIP (ours) | 0.98256 | 0.987789 | 0.98914 | 0.897935 |
| CLIP (ViT-B/32) | 0.98254 | 0.987763 | 0.98917 | 0.898041 |
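The similarity scores reported above are, we assume, cosine similarities between image-encoder feature vectors; a minimal sketch of the measure itself:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors: the dot product
    of the vectors divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)  # 1.0 for identical directions, 0.0 for orthogonal
```

In a CLIP-style pipeline the inputs `u` and `v` would be the (typically L2-normalized) embeddings produced by the image encoder for each digitized print.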
HEX codes: 546c94, 5e6b96, 7f8bb3, 6b7ca4, 6774a4, 3c4476
| Query | Rank 1 | Rank 2 | Rank 3 |
|---|---|---|---|
| Blue |  |  |  |
| Dayflower blue |  |  |  |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, K.; Wang, J.; Batjargal, B.; Maeda, A. Intuitively Searching for the Rare Colors from Digital Artwork Collections by Text Description: A Case Demonstration of Japanese Ukiyo-e Print Retrieval. Future Internet 2022, 14, 212. https://doi.org/10.3390/fi14070212