Image Sampling Based on Dominant Color Component for Computer Vision
Abstract
1. Introduction
- A gray feature map at the original resolution is extracted to retain the main structural information of the image, so that object boundaries remain distinguishable. Reducing the data depth from 24 bits to 8 bits per pixel cuts the data volume by two-thirds relative to the original RGB image, while the retained boundary features keep the performance loss in computer vision tasks minimal.
- A succinct color feature map is constructed from the dominant color component of each pixel to capture the distinguishing properties of objects. Specifically, the index of the color channel with the largest value at each pixel serves as the color feature. This feature map is then spatially downsampled, so the color information is represented with minimal data.
- The proposed method produces compact compressed data through simple non-deep-learning computations, yielding a low-complexity, efficient approach that adapts to various tasks, such as image classification and object detection, without modification. Experimental results demonstrate its efficiency in terms of computational complexity, compression ability, and generalization. A minimal implementation sketch of this two-branch pipeline is given after this list.
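The following NumPy sketch illustrates how such a two-branch sampler could be implemented. The luma weights, the `isdcc_sample` name, the default 2× downsampling factor, and the left-top block selection are illustrative assumptions; the text above does not fix these parameters, and the dominance threshold studied in the ablations (Section 5.4.2) is omitted here.

```python
import numpy as np

def isdcc_sample(rgb: np.ndarray, factor: int = 2):
    """Hypothetical sketch of dominant-color-component sampling (ISDCC).

    rgb: H x W x 3 uint8 image. Returns a full-resolution 8-bit gray
    feature map plus a downsampled map of dominant-channel indices.
    """
    # Gray branch: full-resolution structure at 8 bits/pixel
    # (standard luma weights assumed, not taken from the paper).
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = (rgb.astype(np.float32) @ weights).astype(np.uint8)

    # Color branch: index of the dominant channel at each pixel
    # (0 = R, 1 = G, 2 = B), i.e., a 2-bit feature per sample.
    dominant = rgb.argmax(axis=2).astype(np.uint8)

    # Keep only the left-top pixel of each factor x factor block,
    # the zero-FLOP downsampling variant from the ablations.
    color_feature = dominant[::factor, ::factor]
    return gray, color_feature
```

With `factor = 2`, each 2×2 block contributes four 8-bit gray samples and one 2-bit color index, which is consistent with the per-pixel cost broken down in Section 4.3.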
2. Related Work
2.1. Image Sampling for Human Eyes
2.2. Image Preprocessing for Computer Vision
3. Motivation
4. Method
4.1. Overview
4.1.1. Gray Imaging
4.1.2. Color Feature
4.2. Sampling Result Visualization
4.3. Cost Assessment
5. Experiments
5.1. Object Detection
5.1.1. mAP on Different Datasets
5.1.2. mAP under Different Detection Algorithms
5.1.3. mAP under Different Scales of Models
5.2. Image Classification
5.2.1. Accuracy on Different Datasets
5.2.2. Accuracy under Different Scales of Models
5.2.3. Accuracy under Different Classification Algorithms
5.3. Comparison with Other Image Sampling Methods
5.4. Ablation Experiments
5.4.1. Performance with Different Bits of Color Feature
5.4.2. Performance with Different Thresholds
5.4.3. Performance with Different Downsampling Factors
5.4.4. Performance with Different Downsampling Methods
5.4.5. Performance with Orders of Downsampling
5.4.6. Significance Test
5.4.7. Convergence Speed
5.5. Effect Visualization
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Jain, D.K.; Zhao, X.; González-Almagro, G.; Gan, C.; Kotecha, K. Multimodal pedestrian detection using metaheuristics with deep convolutional neural network in crowded scenes. Inf. Fusion 2023, 95, 401–414.
2. Zivkovic, M.; Bacanin, N.; Antonijevic, M.; Nikolic, B.; Kvascev, G.; Marjanovic, M.; Savanovic, N. Hybrid CNN and XGBoost Model Tuned by Modified Arithmetic Optimization Algorithm for COVID-19 Early Diagnostics from X-ray Images. Electronics 2022, 11, 3798.
3. Nyquist, H. Certain Topics in Telegraph Transmission Theory. Trans. Am. Inst. Electr. Eng. 1928, 47, 617–644.
4. Wallace, G. The JPEG still picture compression standard. IEEE Trans. Consum. Electron. 1992, 38, xviii–xxxiv.
5. Cui, J.; Li, F.; Wang, L. Image Sampling for Machine Vision. In Proceedings of the CAAI International Conference on Artificial Intelligence, Beijing, China, 27–28 August 2022.
6. Terzopoulos, D.; Vasilescu, M. Sampling and reconstruction with adaptive meshes. In Proceedings of the Computer Vision and Pattern Recognition, Maui, HI, USA, 3–6 June 1991; pp. 70–75.
7. Eldar, Y.; Lindenbaum, M.; Porat, M.; Zeevi, Y. The farthest point strategy for progressive image sampling. IEEE Trans. Image Process. 1997, 6, 1305–1315.
8. Ramoni, G.; Carrato, S. An adaptive irregular sampling algorithm and its application to image coding. Image Vis. Comput. 2001, 19, 451–460.
9. Wei, L.; Wang, R. Differential domain analysis for non-uniform sampling. ACM Trans. Graph. 2011, 30, 1–10.
10. Marvasti, F.; Liu, C.; Adams, G. Analysis and recovery of multidimensional signals from irregular samples using nonlinear and iterative techniques. Signal Process. 1994, 36, 13–30.
11. Devir, Z.; Lindenbaum, M. Blind adaptive sampling of images. IEEE Trans. Image Process. 2012, 21, 1478–1487.
12. Vipula, S.; Navin, R. Data compression using non-uniform sampling. In Proceedings of the International Conference on Signal Processing, Chennai, India, 22–24 February 2007; pp. 603–607.
13. Laurent, D.; Nira, D.; Armin, I. Image compression by linear splines over adaptive triangulations. Signal Process. 2006, 86, 1604–1616.
14. Chen, W.; Itoh, S.; Shiki, J. Irregular sampling theorems for wavelet subspaces. IEEE Trans. Inf. Theory 1998, 44, 1131–1142.
15. Liu, Y. Irregular sampling for spline wavelet subspaces. IEEE Trans. Inf. Theory 1996, 42, 623–627.
16. Behzad, S.; Nazanin, R. Model-based nonuniform compressive sampling and recovery of natural images utilizing a wavelet-domain universal hidden Markov model. IEEE Trans. Signal Process. 2017, 65, 95–104.
17. Lorenzo, P.; Lorenzo, G.; Pierre, V. Image compression using an edge adapted redundant dictionary and wavelets. Signal Process. 2006, 86, 444–456.
18. Oztireli, A.; Alexa, M.; Gross, M. Spectral sampling of manifolds. ACM Trans. Graph. 2010, 29, 1–8.
19. Sochen, N.; Kimmel, R.; Malladi, R. A general framework for low level vision. IEEE Trans. Image Process. 1998, 7, 310–318.
20. Cheng, S.; Dey, T.; Ramos, E. Manifold reconstruction from point samples. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, Vancouver, BC, Canada, 23–25 January 2005; pp. 1018–1027.
21. Saucan, E.; Appleboim, E.; Zeevi, Y. Geometric approach to sampling and communication. Sampl. Theory Signal Image Process. 2010, 11, 1–24.
22. Krishnamoorthi, R.; Seetharaman, K. Image compression based on a family of stochastic models. Signal Process. 2007, 87, 408–417.
23. Ji, S.; Xue, Y.; Lawrence, C. Bayesian compressive sensing. IEEE Trans. Signal Process. 2008, 56, 2346–2356.
24. Matthew, M.; Robert, N. Near-optimal adaptive compressed sensing. IEEE Trans. Inf. Theory 2014, 60, 4001–4012.
25. Ali, T.; Farokh, M. Adaptive Sparse Image Sampling and Recovery. IEEE Trans. Comput. Imaging 2018, 4, 311–325.
26. Dai, Q.; Henry, C.; Emeline, P.; Oliver, C.; Marc, W.; Aggelos, K. Adaptive Image Sampling Using Deep Learning and Its Application on X-Ray Fluorescence Image Reconstruction. IEEE Trans. Multimed. 2020, 22, 2564–2578.
27. Wang, Z.; Li, F.; Xu, J.; Pamela, C. Human-Machine Interaction Oriented Image Coding for Resource-Constrained Visual Monitoring in IoT. IEEE Internet Things J. 2022, 9, 16181–16195.
28. Mei, Y.; Li, L.; Li, Z.; Li, F. Learning-Based Scalable Image Compression with Latent-Feature Reuse and Prediction. IEEE Trans. Multimed. 2022, 24, 4143–4157.
29. Muhammad, H.; Greg, S.; Norimichi, U. Task-Driven Super Resolution: Object Detection in Low-Resolution Images. arXiv 2018, arXiv:1803.11316.
30. Muhammad, W.; Bernhard, S.; Michael, H. The Unreasonable Effectiveness of Texture Transfer for Single Image Super-Resolution. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018; pp. 80–97.
31. Maneet, S.; Shruti, N.; Richa, S.; Mayank, V. Dual Directed Capsule Network for Very Low Resolution Image Recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 340–349.
32. Satoshi, S.; Motohiro, T.; Kazuya, H.; Takayuki, O.; Atsushi, S. Image Pre-Transformation for Recognition-Aware Image Compression. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 2686–2690.
33. Vivek, S.; Ali, D.; Davy, N.; Michael, B.; Luc, V.; Rainer, S. Classification Driven Dynamic Image Enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4033–4041.
34. Jonghwa, Y.; Kyung-Ah, S. Enhancing the Performance of Convolutional Neural Networks on Quality Degraded Datasets. In Proceedings of the International Conference on Digital Image Computing: Techniques and Applications (DICTA), Sydney, Australia, 29 November–1 December 2017; pp. 1–8.
35. Ren, K.; Gao, Y.; Wan, M.; Gu, G.; Chen, Q. Infrared small target detection via region super resolution generative adversarial network. Appl. Intell. 2022, 52, 11725–11737.
36. Veena, M.; Sowmya, K.; Uma, K.; Divyalakshmi, K.; Rajendra, A. An empirical study of preprocessing techniques with convolutional neural networks for accurate detection of chronic ocular diseases using fundus images. Appl. Intell. 2023, 53, 1548–1566.
37. Chen, J.; Zeng, Z.; Zhang, R.; Wang, W.; Zheng, Y.; Tian, K. Adaptive illumination normalization via adaptive illumination preprocessing and modified Weber-face. Appl. Intell. 2019, 49, 872–882.
38. Zhou, J.; Zhang, D.; Zhang, W. Underwater image enhancement method via multi-feature prior fusion. Appl. Intell. 2022, 52, 16435–16457.
39. Xu, X.; Zhan, W.; Zhu, D.; Jiang, Y.; Chen, Y.; Guo, J. Contour information-guided multi-scale feature detection method for visible-infrared pedestrian detection. Entropy 2023, 25, 1022.
40. Hossein, T.; Peyman, M. Learning to Resize Images for Computer Vision Tasks. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 11–17 October 2021; pp. 487–496.
41. Jia, D.; Wei, D.; Richard, S.; Li, L.; Kai, L.; Li, F. ImageNet: A large-scale hierarchical image database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255.
42. Chen, Z.; Bernard, G. ThumbNet: One Thumbnail Image Contains All You Need for Recognition. In Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020; pp. 1506–1514.
43. Chen, T.; Lin, L.; Zuo, W.; Luo, X.; Zhang, L. Learning a Wavelet-like Auto-Encoder to Accelerate Deep Neural Networks. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; pp. 6722–6729.
44. Li, Y.; Bu, R.; Sun, M.; Wu, W.; Di, X.; Chen, B. PointCNN: Convolution on X-Transformed Points. In Proceedings of the Advances in Neural Information Processing Systems (NIPS 2018), Montreal, QC, Canada, 3–8 December 2018; Volume 31.
45. Qi, C.; Litany, O.; He, K.; Guibas, L. Deep Hough Voting for 3D Object Detection in Point Clouds. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9276–9285.
46. Lang, I.; Manor, A.; Avidan, S. SampleNet: Differentiable Point Cloud Sampling. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 7575–7585.
47. Huang, T.; Zhang, J.; Chen, J.; Liu, Y.; Liu, Y. Resolution-Free Point Cloud Sampling Network with Data Distillation. In Proceedings of the European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022; pp. 54–70.
48. Zhou, W.; Yang, Q.; Jiang, Q.; Zhai, G.; Lin, W. Blind Quality Assessment of 3D Dense Point Clouds with Structure Guided Resampling. arXiv 2022, arXiv:2208.14603.
49. Yang, Z.; Qiu, Z.; Fu, D. DMIS: Dynamic Mesh-based Importance Sampling for Training Physics-Informed Neural Networks. arXiv 2022, arXiv:2211.13944.
50. Lin, T.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, L. Microsoft COCO: Common objects in context. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 740–755.
51. Itti, L.; Koch, C.; Niebur, E. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 1254–1259.
52. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
53. Joseph, R.; Santosh, D.; Ross, G.; Ali, F. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
54. Joseph, R.; Ali, F. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6517–6525.
55. Joseph, R.; Ali, F. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
56. Glenn, J. YOLOv5. 2020. Available online: https://github.com/ultralytics/yolov5 (accessed on 1 March 2023).
57. Ren, S.; He, K.; Ross, G.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
58. Zhou, X.; Koltun, V.; Krähenbühl, P. Tracking Objects as Points. In Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020; pp. 474–490.
59. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520.
60. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. In Proceedings of the International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015; pp. 1–14.
61. Huang, G.; Liu, Z.; Laurens, V.; Kilian, Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2261–2269.
| | Precision/% | Decrease Compared to RGB/% |
|---|---|---|
| Numerical Change | 58.58 | −14.18 |
| Ordinal Change | 51.81 | −20.94 |
| Input | Bit/Pixel | FLOPs/Pixel | Time/ms |
|---|---|---|---|
| RGB | 24 | 0 | 5.24 |
| Gray | 8 | 5 | 5.60 |
| ISDCC | 8.5 | 10 | 6.55 |
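Assuming the color feature is a 2-bit channel index retained at one sample per 2×2 block (our reading of the downsampling factor; it is not stated explicitly in this excerpt), the 8.5 bit/pixel entry for ISDCC decomposes as:

```latex
\underbrace{8}_{\text{gray map, full resolution}}
+ \underbrace{\tfrac{2}{2 \times 2}}_{\text{2-bit color index, downsampled}}
= 8.5\ \text{bits/pixel}
```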
| Dataset | Input | mAP@0.5/% | mAP@0.5–0.95/% |
|---|---|---|---|
| VOC | RGB | 86.6 | 62.6 |
| | Gray | 85.6 | 60.8 |
| | ISDCC | 85.8 | 61.4 |
| COCO | RGB | 55.4 | 36.7 |
| | Gray | 53.6 | 35.0 |
| | ISDCC | 55.4 | 35.8 |
| Input | YOLOv5-m | Faster R-CNN [57] | CenterNet [58] |
|---|---|---|---|
| RGB | 90.1 | 70.1 | 74.9 |
| Gray | 89.2 | 65.4 | 72.4 |
| ISDCC | 89.5 | 66.0 | 72.8 |
| Increase | +0.3 | +0.6 | +0.4 |
| Input | CIFAR-100 | ImageNet |
|---|---|---|
| RGB | 78.65 | 72.76 |
| Gray | 71.44 | 70.95 |
| ISDCC | 73.92 | 71.34 |
| Increase | +2.48 | +0.39 |
| Input | MobileNet-v2 [59] | VGG16 [60] | DenseNet121 [61] |
|---|---|---|---|
| RGB | 68.61 | 72.38 | 79.24 |
| Gray | 60.45 | 63.92 | 72.04 |
| ISDCC | 64.76 | 67.81 | 75.62 |
| Increase | +4.31 | +3.89 | +3.58 |
| Dataset | Method | FLOPs/Pixel | Bit/Pixel | mAP@0.5/% | mAP@0.5–0.95/% |
|---|---|---|---|---|---|
| VOC | Y:U:V = 4:1:1 | 15 | 12 | 85.8 | 61.5 |
| | Y:U:V = 4:2:0 | 15 | 12 | 85.9 | 61.5 |
| | Farthest | O(n²) | 8.5 | 85.9 | 61.5 |
| | Random | O(n) | 8.5 | 85.9 | 61.5 |
| | ISDCC | 10 | 8.5 | 85.8 | 61.4 |
| COCO | Y:U:V = 4:1:1 | 15 | 12 | 55.4 | 35.8 |
| | Y:U:V = 4:2:0 | 15 | 12 | 55.5 | 35.8 |
| | Farthest | O(n²) | 8.5 | 51.9 | 33.0 |
| | Random | O(n) | 8.5 | 49.6 | 31.3 |
| | ISDCC | 10 | 8.5 | 55.4 | 35.8 |
| Method | FLOPs | mAP@0.5/% | mAP@0.5–0.95/% |
|---|---|---|---|
| Left-top | 0 | 55.4 | 35.8 |
| Right-down | 0 | 55.0 | 35.4 |
| Max-pooling | 3 | 55.2 | 35.5 |
| Min-pooling | 3 | 55.0 | 35.5 |
| Average-pooling | 5 | 54.9 | 35.5 |
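For reference, the pooling variants in the table above could be realized on the dominant-channel map roughly as follows. This is a sketch assuming non-overlapping factor × factor blocks applied to the per-pixel index map from the earlier `isdcc_sample` sketch; whether the paper pools the index map or the color channels before index extraction is not stated in this excerpt.

```python
import numpy as np

def block_downsample(dominant: np.ndarray, factor: int = 2, mode: str = "max"):
    """Downsample a 2-D feature map over non-overlapping blocks."""
    # Crop to a multiple of the block size, then split into blocks.
    h = (dominant.shape[0] // factor) * factor
    w = (dominant.shape[1] // factor) * factor
    blocks = dominant[:h, :w].reshape(h // factor, factor, w // factor, factor)
    if mode == "max":    # 3 comparisons per 2x2 block
        return blocks.max(axis=(1, 3))
    if mode == "min":    # 3 comparisons per 2x2 block
        return blocks.min(axis=(1, 3))
    if mode == "mean":   # 3 adds plus a division per 2x2 block, then rounding
        return blocks.mean(axis=(1, 3)).round().astype(dominant.dtype)
    raise ValueError(f"unknown mode: {mode}")
```

The FLOPs column fits this kind of per-block accounting: left-top and right-down selection cost nothing, while the pooling variants pay a few comparisons or additions per block.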
| Downsampled Map | mAP@0.5/% | mAP@0.5–0.95/% |
|---|---|---|
| Gray Image | 54.9 | 35.3 |
| Color Feature Map | 55.4 | 35.8 |
| Method | RGB | Gray | Ours |
|---|---|---|---|
| RGB | -\- | 1\0 | 1\0 |
| Gray | 0\1 | -\- | 0\1 |
| Ours | 0\1 | 1\0 | -\- |