1. Introduction
The analysis of products on store shelves has been a research focus for decades [1,2,3,4,5,6,7,8].
Recognizing individual products at the SKU level can aid in analyzing product shelf share and planogram compliance while considering limited promotions or seasonal facings. It can also be used to personalize promotions or recommendations for individual customers. Additionally, such systems, which recognize products on store shelves based on the appearance of their packaging, can be designed to assist individuals with disabilities. For instance, they can offer specific conveniences for visually impaired people.
The research presented in this article focuses on recognizing product SKUs based on their packaging facing, not barcodes. An SKU is a unique code used internally by retailers and e-commerce sellers to identify each product and its variation, often including details such as style, size, color, collection, or packaging-facing version. In contrast, barcodes are used externally across the retail supply chain to identify the manufacturer and product number, without detailed product information. Changing a product's packaging alters the SKU but not the barcode, which is crucial for tracking promotional campaign effectiveness, e.g., a line of drinks released for a sports event with athletes' images on the cans. One barcode can correspond to multiple SKUs, and one SKU can apply to products with different barcodes if the packaging is the same but the manufacturing location differs.
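To make this many-to-many relationship concrete, here is a minimal sketch; all identifiers below are invented for illustration:

```python
# Hypothetical identifiers illustrating the many-to-many barcode/SKU relation.
barcode_to_skus = {
    "5901234123457": ["DRINK-330-STD", "DRINK-330-EVENT"],  # promo facing adds a SKU
}
sku_to_barcodes = {
    "DRINK-330-STD": ["5901234123457", "5901234123464"],  # two plants, same facing
}
```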
A significant challenge in this area is the diversity of products, the vast number of classes, frequent packaging changes, and seasonal rotations, all of which demand flexible and scalable solutions. Typically, automated SKU recognition involves two stages: detecting the products on the shelf and then recognizing them. While state-of-the-art detectors locate products effectively [9], the recognition problem remains challenging [10]. Using a neural network for classification requires preparing the network to recognize every possible class. Each class needs some representation in the form of images, and the required quantity depends on many factors. However, when the number of classes is large, the size of the dataset grows significantly, often resulting in huge datasets. This problem is particularly acute in the food industry due to the continuous rotation of products. Consequently, classification typically relies on a two-step approach. The first step, based on a convolutional neural network, generates a multidimensional feature vector that should be unique for each class. In the second step, the generated vector is compared against a reference set of feature vectors. However, creating a pattern set that accounts for different views of each product, varying lighting conditions, noise, or color temperature is nearly impossible. Thus, SKU recognition should be approached as a one-shot or few-shot learning problem, where the goal is to infer the class of a detected product from one or a few prototype images. Given that every product launched on the market initially has a digital design of its label/facing or an e-commerce model used in online store listings, a recognition process based on such a single prototype would be groundbreaking.
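To illustrate the second step, here is a minimal nearest-prototype sketch, assuming a trained `encoder`, a precomputed `prototype_bank` of reference embeddings, and a matching `labels` list (all names are hypothetical):

```python
import torch
import torch.nn.functional as F

def classify_by_nearest_prototype(encoder, image, prototype_bank, labels):
    """Assign the class of the nearest prototype embedding (1-NN retrieval)."""
    with torch.no_grad():
        z = encoder(image.unsqueeze(0))            # 1 x D feature vector
        z = F.normalize(z, dim=1)                  # unit length for cosine similarity
        bank = F.normalize(prototype_bank, dim=1)  # N x D prototype embeddings
        sims = z @ bank.T                          # cosine similarities, 1 x N
        return labels[sims.argmax(dim=1).item()]
```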
To address these challenges, this paper explores the potential of one-shot learning, which relies on a single prototype image per class and is particularly suited to environments where data scarcity is the norm. We introduce and evaluate the Variational Prototyping Encoder (VPE) architecture for classifying store-shelf products. The VPE architecture effectively handles domain discrepancies and data imbalances by utilizing pairs of prototype and real images [11]. This approach facilitates learning a latent feature space in which the Variational Autoencoder (VAE) ensures that features of actual products are tightly clustered around the prototype features.
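For concreteness, the following sketches one VPE training step along the lines of [11]; the `encoder`/`decoder` modules and the unweighted sum of the loss terms are assumptions rather than the exact implementation used here:

```python
import torch
import torch.nn.functional as F

def vpe_loss(encoder, decoder, real_img, prototype_img):
    """One training step on a (real image, prototype) pair.

    The decoder reconstructs the *prototype*, not the input photo, so
    real-world variations are collapsed onto the canonical class design.
    """
    mu, logvar = encoder(real_img)                 # parameters of q(z|x)
    std = torch.exp(0.5 * logvar)
    z = mu + std * torch.randn_like(std)           # reparameterization trick
    recon = decoder(z)                             # decoder ends with a sigmoid
    bce = F.binary_cross_entropy(recon, prototype_img, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld
```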
The main contributions of this paper can be summarized as follows:
A novel modification of the VPE algorithm was made, incorporating prototypes as a signal at the encoder input (see the sketch after this list);
The modified VPE was adapted for product recognition on retail shelves;
The impact of data diversity and quality was analyzed, focusing on key aspects such as augmentation techniques, background uniformity, and optimal prototype selection;
A comprehensive optimization of parameters and techniques for the VPE was conducted. This included methods for stopping network training, distance metrics in the latent space, network architecture, and various implemented loss functions.
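As a rough illustration of the first contribution, the sketch below (function and variable names are hypothetical) appends each class prototype to the training set as an encoder input, in addition to the usual real-image/prototype pairs:

```python
def build_training_pairs(real_images_by_class, prototypes):
    """Return (encoder input, reconstruction target) pairs.

    Alongside the standard VPE pairs (real photo -> prototype), every
    prototype is also fed through the encoder with itself as the target.
    """
    pairs = []
    for cls, images in real_images_by_class.items():
        proto = prototypes[cls]
        pairs.extend((img, proto) for img in images)  # real photo -> prototype
        pairs.append((proto, proto))                  # prototype -> prototype
    return pairs
```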
2. Related Works
One-shot learning stands out as a pivotal technique in which a model is designed to acquire knowledge from a single example, contrasting sharply with traditional deep learning approaches that rely on extensive datasets. Pioneering efforts in this field, such as the work of Li et al., utilize a Bayesian strategy to harness latent and generic prior information, demonstrating that such learned priors can adapt effectively to various small-data problems, thereby alleviating issues of data imbalance and showing promising generalizability [12]. Furthering these concepts, Lake et al. explored generative processes using hierarchical Bayesian models, which proved capable of extending to new tasks with minimal data input [13].
Recent strategies in one-shot learning have focused on embedding learning and meta-learning. Works by researchers [14,15] have advanced the field of metric learning by transforming task-related information into a metric space where classification occurs through the comparison of similarity scores. In contrast, approaches by [16,17] aim to imbue models with the ability to adapt to new tasks, aligning with meta-learning methodologies.
Chen et al. [18] have extended prototype learning to one-shot image segmentation by incorporating multi-class label information during episodic training to generate more nuanced feature representations for each category. Prototypical Networks [19] introduced an approach where classification in few-shot scenarios is facilitated by computing distances to class-centered prototypes, representing a simpler yet effective bias beneficial in limited-data conditions.
When addressing the challenges of retail shelf product recognition, Wang's proposal of an enhanced Siamese neural network for one-shot learning is particularly noteworthy [20]. This approach introduces a spatial-channel dual attention mechanism aimed at refining the network architecture, significantly enhancing the network's ability to focus on and interpret subtle product details.
On the generative modeling front, the VAE, introduced by Kingma and Welling, is a generative model comprising encoder and decoder networks [21]. A VAE encodes input data into a latent space and decodes it back to the original domain, facilitating tasks such as image reconstruction and generation. The VPE, presented by Kim et al. as a derivative of the VAE, specializes in the one-shot classification of graphic symbols, enabling categorization with a single prototype image per class [11].
Recent research explores extensions such as the Variational Multi-Prototype Encoder (VaMPE) [22] and the Semi-Supervised Variational Prototyping Encoder (SS-VPE) [23]. VaMPE utilizes multiple prototypes per class to enhance model performance without the need for additional sub-labeling. SS-VPE employs generative unsupervised learning to optimize prototypes in latent space, applies a Student's-t mixture model for robust outlier management, and advances the VAE for enhanced few-shot semi-supervised learning performance. Also worth mentioning is VPE++, which inherently reduces hubness and incorporates contrastive and multi-task losses to increase the discriminative ability of few-shot learning models [24].
The evolving landscape of one-shot learning, prototype methods, and VAE-based approaches underscores the continuous efforts to address challenges in learning from limited data and improve the efficiency and effectiveness of machine learning models. These advancements hold promise for applications across various domains, including image recognition.
This paper focuses on employing one-shot learning techniques utilizing prototype SKU images. One-shot learning trains a model to recognize patterns or objects from a single example, making it particularly suited to scenarios with limited data. Here, prototypes, representative examples of product categories, are utilized alongside unique SKU identifiers to develop a model capable of discerning various products from single instances.
5. Conclusions
This study has successfully implemented a VPE tailored to the problem of recognizing retail shelf items from a limited dataset based on product graphics prototypes, achieving satisfactory accuracy. The strategic addition of prototypes to each training set notably improved the recognition rate of unseen classes, indicating a substantial improvement in the algorithm’s ability to identify new classes without prior exposure.
The results show that the cosine distance measure consistently outperforms the Euclidean measure across both evaluation methods, yielding higher recall values in all tested scenarios. Further comparisons showed that appropriately chosen image sizes and augmentation techniques positively affect the algorithm's performance. Random rotation and horizontal flipping were the only transformations that did not yield the anticipated outcomes for the analyzed dataset. This could be attributed to the specific nature of the products, such as beverages and dairy items. These items are typically placed on store shelves in a specific position and are rarely tilted or turned, as store staff ensure their proper arrangement. Consequently, the augmentation technique involving random rotation of images might have introduced unrealistic representations of these products.
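To make the two distance measures concrete, here is a minimal sketch of nearest-prototype retrieval under either metric (tensor names are illustrative):

```python
import torch

def nearest_prototype(z, bank, metric="cosine"):
    """Index of the nearest prototype for each query embedding.

    z: B x D query embeddings; bank: N x D prototype embeddings.
    """
    if metric == "cosine":
        z_n = z / z.norm(dim=1, keepdim=True)
        b_n = bank / bank.norm(dim=1, keepdim=True)
        return (z_n @ b_n.T).argmax(dim=1)       # highest cosine similarity
    return torch.cdist(z, bank).argmin(dim=1)    # smallest Euclidean distance
```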
Adding spatial transformers enhances the algorithm's performance because it increases the network's robustness to geometric deformations, allowing objects to be recognized regardless of their orientation or position. As a result, the network can more accurately identify important features of beverages, such as labels and bottle shapes, while ignoring less relevant background elements.
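A minimal spatial transformer front-end in this spirit might look as follows; the layer sizes and the 64 x 64 input resolution are illustrative assumptions, not the exact configuration from the experiments:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class STN(nn.Module):
    """Predicts an affine transform of the input and resamples it, so the
    downstream encoder sees a geometrically normalized image."""
    def __init__(self):
        super().__init__()
        self.loc = nn.Sequential(                      # localization network
            nn.Conv2d(3, 8, 7), nn.MaxPool2d(2), nn.ReLU(True),
            nn.Conv2d(8, 10, 5), nn.MaxPool2d(2), nn.ReLU(True),
        )
        self.fc = nn.Sequential(
            nn.Linear(10 * 12 * 12, 32), nn.ReLU(True), nn.Linear(32, 6),
        )
        self.fc[-1].weight.data.zero_()                # start at the identity
        self.fc[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):                              # x: B x 3 x 64 x 64
        theta = self.fc(self.loc(x).flatten(1)).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)
```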
No significant differences were observed among the tested loss functions, but all proved effective in optimizing the model's performance, confirming their usefulness for complex problems involving VAEs. Uniform testing conditions for prototypes, such as consistent backgrounds and the selection of suitable prototypes, contributed to a cohesive assessment environment that yielded satisfactory results.
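As an example of one of the less common losses tested, here is a sketch of an ERGAS-style reconstruction loss, adapted from its usual role as an image-quality metric; the normalization details are assumptions:

```python
import torch

def ergas_loss(recon, target, ratio=1.0, eps=1e-8):
    """ERGAS-style loss: per-channel RMSE normalized by the channel mean.

    recon, target: B x C x H x W tensors; `ratio` plays the role of the
    resolution ratio in the original metric and is set to 1 here.
    """
    rmse = torch.sqrt(((recon - target) ** 2).mean(dim=(2, 3)))  # B x C
    mean = target.mean(dim=(2, 3)).clamp_min(eps)                # B x C
    return (100.0 * ratio * torch.sqrt(((rmse / mean) ** 2).mean(dim=1))).mean()
```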
The results revealed variations in recognition rates across product categories. Dairy products proved the most challenging compared with drinks and snacks. This can be attributed to the relatively small size of dairy products and to the placement of significant identifying features on the front of the packaging as well as on the lids. Moreover, the labels of dairy products often use muted, similar colors.
The model can effortlessly distinguish visually similar products of the same brands with similar packaging, differing only in aspects such as flavor, as demonstrated for the first and second pairs in Figure 8. However, the model is unable to differentiate between dairy products of the same type in different sizes based only on the prototype. It is worth mentioning that humans would also struggle to make this distinction based on images alone.
The model demonstrates high efficiency in reconstructing prototypes for classes seen during the training process. It performs almost flawlessly even when the images are blurred, have low resolution, are shadowed, or are only partially visible. Although generating prototypes for classes unseen during training is not as precise, it still reflects the key features of these classes from the input images. The model accurately handles high-level features such as the dominant color or shape of the packaging, and while detailed elements may not be precisely replicated, the locations of colors and shapes are approximately consistent with the actual products. The model exhibits particularly good abilities in reconstructing prototypes for classes that were not directly seen during the training phase but were learned through other variants of the product. The model can detect subtle differences and accurately reproduce features characteristic of a new flavor variant, not just those it already knows. As a result, even new and previously unknown variants can be represented with satisfactory accuracy. The VPE implicitly learns how to neutralize real-world disturbances in the input image and, to some extent, captures high-level prototype concepts for classes unseen during the training phase.
In conclusion, it is worth mentioning the limitations encountered during the research. The lack of available collections of store-shelf products, divided into SKUs and paired with their prototypes, meant that a large part of the project work consisted of obtaining and developing such datasets. This is a very time-consuming and laborious process, which undoubtedly limits the possibility of efficiently and extensively testing the solution in various scenarios.
Several promising avenues for further research and development can be identified to enhance the current approach. One such direction is the application of diffusion models to the studied problem. Diffusion models, also known as diffusion probabilistic models or score-based generative models, represent a class of latent variable generative models that have recently gained significant attention in the machine learning community [28]. These models have been shown to outperform traditional methods, such as VAEs, in generating more accurate and robust latent spaces. By integrating diffusion models, a more precise representation of the underlying data structure may be achieved, leading to improved performance in the studied task. This could unlock new, innovative solutions and provide a deeper understanding of the complexities involved.
Additionally, the exploration of reranking methods presents another potential enhancement. Specifically, refining SKU recognition through a reranking technique that optimizes results based on the top-5 nearest neighbors could be highly effective. This reranking process could leverage extracted global features to reassess and reorder initial predictions, or it could generate new features by applying local feature detectors and descriptors such as SIFT [29] or SURF [30]. By doing so, the accuracy of SKU recognition could be significantly improved, especially in challenging scenarios.
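A sketch of how such reranking might look with OpenCV's SIFT implementation; the candidate list format and the 0.75 ratio threshold (Lowe's ratio test) are illustrative choices:

```python
import cv2

def rerank_top5(query_img, candidates):
    """Re-order (label, prototype image) candidates by good SIFT matches."""
    sift = cv2.SIFT_create()
    bf = cv2.BFMatcher(cv2.NORM_L2)
    _, q_desc = sift.detectAndCompute(query_img, None)
    scored = []
    for label, proto in candidates:                   # top-5 from the 1-NN stage
        _, p_desc = sift.detectAndCompute(proto, None)
        good = 0
        if q_desc is not None and p_desc is not None and len(p_desc) >= 2:
            for pair in bf.knnMatch(q_desc, p_desc, k=2):
                if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
                    good += 1                         # passes Lowe's ratio test
        scored.append((good, label))
    return [label for _, label in sorted(scored, key=lambda s: s[0], reverse=True)]
```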
A further challenge in SKU recognition arises when dealing with similar product facings for SKUs that differ only in size. Two products with identical packaging but different volumes can be difficult to distinguish. To address this issue, one could explore the analysis of the width-to-height ratio of detected products as a distinguishing feature. Alternatively, a model could be trained to estimate the size of a product based on the gap space between shelves, providing additional context for accurate SKU identification. These approaches could mitigate the ambiguity in recognizing products with similar appearances, ultimately leading to more reliable SKU classification.
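A sketch of the size-cue idea under the stated assumptions, i.e., a detector supplying product bounding boxes and a measured shelf gap in pixels (names are hypothetical):

```python
def relative_height(product_box, shelf_gap_px):
    """Pixel height of a detected product normalized by the shelf gap,
    a rough scale-invariant size cue for same-facing SKUs."""
    x1, y1, x2, y2 = product_box
    return (y2 - y1) / shelf_gap_px
```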
Author Contributions
Conceptualization, G.S.; methodology, A.K.; software, A.K.; validation, G.S.; formal analysis, G.S.; investigation, A.K.; resources, G.S.; data curation, G.S.; writing—original draft preparation, A.K.; writing—review and editing, G.S.; visualization, A.K.; supervision, G.S.; project administration, G.S.; funding acquisition, G.S. All authors have read and agreed to the published version of the manuscript.
Funding
This research was co-funded by the National Center for Research and Development under Subtask 1.1.1 of the Smart Growth Operational Program 2014–2020, co-financed from public funds of the Regional Development Fund No. 2014/2020 under grant no. POIR.01.01.01-00-2326/20-00.
Data Availability Statement
The data presented in this study are available on request from the corresponding author due to the commercial nature of the data. In the future, it is planned to make the collection publicly available with a request to cite this paper if it is used for research purposes.
Acknowledgments
Many thanks to the Omniaz mapping team and all those who ensured the quality of the data provided for the experiments.
Conflicts of Interest
Author Grzegorz Sarwas is currently employed at the Warsaw University of Technology and the company Omniaz Sp. z o.o. The remaining author (Aleksandra Kowalczyk) declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
1. Merler, M.; Galleguillos, C.; Belongie, S. Recognizing Groceries in situ Using in vitro Training Data. In Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA, 17–22 June 2007; pp. 1–8.
2. George, M.; Mircic, D.; Sörös, G.; Floerkemeier, C.; Mattern, F. Fine-Grained Product Class Recognition for Assisted Shopping. In Proceedings of the 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), Santiago, Chile, 7–13 December 2015; pp. 546–554.
3. Melek, C.G.; Sonmez, E.B.; Albayrak, S. A survey of product recognition in shelf images. In Proceedings of the 2017 International Conference on Computer Science and Engineering (UBMK), Antalya, Turkey, 5–8 October 2017; pp. 145–150.
4. Tonioni, A.; Serra, E.; Di Stefano, L. A deep learning pipeline for product recognition on store shelves. In Proceedings of the 2018 IEEE International Conference on Image Processing, Applications and Systems (IPAS), Sophia Antipolis, France, 12–14 December 2018; pp. 25–31.
5. Geng, W.; Han, F.; Lin, J.; Zhu, L.; Bai, J.; Wang, S.; He, L.; Xiao, Q.; Lai, Z. Fine-Grained Grocery Product Recognition by One-Shot Learning. In Proceedings of the 26th ACM International Conference on Multimedia (MM '18), Seoul, Republic of Korea, 22–26 October 2018; pp. 1706–1714.
6. Leo, M.; Carcagnì, P.; Distante, C. A Systematic Investigation on end-to-end Deep Recognition of Grocery Products in the Wild. In Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; pp. 7234–7241.
7. Chen, S.; Liu, D.; Pu, Y.; Zhong, Y. Advances in deep learning-based image recognition of product packaging. Image Vis. Comput. 2022, 128, 104571.
8. Selvam, P.; Faheem, M.; Dakshinamurthi, V.; Nevgi, A.; Bhuvaneswari, R.; Deepak, K.; Abraham Sundar, J. Batch Normalization Free Rigorous Feature Flow Neural Network for Grocery Product Recognition. IEEE Access 2024, 12, 68364–68381.
9. Goldman, E.; Herzig, R.; Eisenschtat, A.; Goldberger, J.; Hassner, T. Precise Detection in Densely Packed Scenes. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 5222–5231.
10. Melek, C.G.; Battini Sönmez, E.; Varlı, S. Datasets and methods of product recognition on grocery shelf images using computer vision and machine learning approaches: An exhaustive literature review. Eng. Appl. Artif. Intell. 2024, 133, 108452.
11. Kim, J.; Oh, T.H.; Lee, S.; Pan, F.; Kweon, I.S. Variational Prototyping-Encoder: One-Shot Learning With Prototypical Images. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 9454–9462.
12. Fe-Fei, L.; Fergus, R.; Perona, P. A Bayesian approach to unsupervised one-shot learning of object categories. In Proceedings of the Ninth IEEE International Conference on Computer Vision, Nice, France, 13–16 October 2003; Volume 2, pp. 1134–1141.
13. Lake, B.M.; Salakhutdinov, R.; Tenenbaum, J.B. Human-level concept learning through probabilistic program induction. Science 2015, 350, 1332–1338.
14. Vinyals, O.; Blundell, C.; Lillicrap, T.; Kavukcuoglu, K.; Wierstra, D. Matching Networks for One Shot Learning. In Advances in Neural Information Processing Systems; Lee, D., Sugiyama, M., Luxburg, U., Guyon, I., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2016; Volume 29.
15. Sung, F.; Yang, Y.; Zhang, L.; Xiang, T.; Torr, P.H.; Hospedales, T.M. Learning to Compare: Relation Network for Few-Shot Learning. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1199–1208.
16. Li, Z.; Zhou, F.; Chen, F.; Li, H. Meta-SGD: Learning to Learn Quickly for Few-Shot Learning. arXiv 2017, arXiv:1707.09835.
17. Finn, C.; Abbeel, P.; Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; Volume 70, pp. 1126–1135.
18. Chen, T.; Xie, G.S.; Yao, Y.; Wang, Q.; Shen, F.; Tang, Z.; Zhang, J. Semantically Meaningful Class Prototype Learning for One-Shot Image Segmentation. IEEE Trans. Multimed. 2022, 24, 968–980.
19. Snell, J.; Swersky, K.; Zemel, R. Prototypical Networks for Few-shot Learning. In Advances in Neural Information Processing Systems; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30.
20. Wang, C.; Huang, C.; Zhu, X.; Zhao, L. One-Shot Retail Product Identification Based on Improved Siamese Neural Networks. Circuits Syst. Signal Process. 2022, 41, 1–15.
21. Kingma, D.P.; Welling, M. Auto-Encoding Variational Bayes. In Proceedings of the 2nd International Conference on Learning Representations (ICLR 2014), Banff, AB, Canada, 14–16 April 2014; Conference Track Proceedings. Available online: https://arxiv.org/abs/1312.6114v11 (accessed on 1 July 2024).
22. Kang, J.S.; Ahn, S.C. Variational Multi-Prototype Encoder for Object Recognition Using Multiple Prototype Images. IEEE Access 2022, 10, 19586–19598.
23. Liu, Y.; Shi, D. SS-VPE: Semi-Supervised Variational Prototyping Encoder With Student's-t Mixture Model. IEEE Trans. Instrum. Meas. 2023, 72, 1–9.
24. Xiao, C.; Madapana, N.; Wachs, J. One-Shot Image Recognition Using Prototypical Encoders with Reduced Hubness. In Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2021; pp. 2251–2260.
25. Panchal, S. Implementation and Comparative Quantitative Assessment of Different Multispectral Image Pansharpening Approaches. Signal Image Process. Int. J. 2015, 6, 35.
26. Bansal, A.; Singhrova, A. Performance Analysis of Supervised Machine Learning Algorithms for Diabetes and Breast Cancer Dataset. In Proceedings of the 2021 International Conference on Artificial Intelligence and Smart Systems (ICAIS), Coimbatore, India, 25–27 March 2021; pp. 137–143.
27. Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.Y.; et al. Segment Anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2–3 October 2023; pp. 4015–4026.
28. Hu, R.; Hu, W.; Li, J. Saliency Driven Nonlinear Diffusion Filtering for Object Recognition. In Proceedings of the 2013 2nd IAPR Asian Conference on Pattern Recognition, Naha, Japan, 5–8 November 2013; pp. 381–385.
29. Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. Int. J. Comput. Vis. 2004, 60, 91–110.
30. Bay, H.; Ess, A.; Tuytelaars, T.; Van Gool, L. Speeded-Up Robust Features (SURF). Comput. Vis. Image Underst. 2008, 110, 346–359.
Figure 1.
Illustration of the training phase of the VPE.
Figure 2.
Illustration of the testing phase of the VPE.
Figure 3.
Examples of different prototypes for one product obtained by rotating the can, highlighting different features of the product.
Figure 4.
T-SNE visualization of features for the beverage product test dataset. Black crosses mark the prototypes of each class from the test set. Additionally, red dots mark the prototypes belonging to the classes defined as seen, i.e., those the model was familiarized with during the training process.
Figure 5.
Test class prototypes from the first phase of the beverage experiments.
Figure 6.
T-SNE visualization of features for the dairy product test dataset. Clusters were marked for selected classes whose prototypes were seen in the training phase.
Figure 7.
Prototype reconstructions obtained for the dairy product test set, compared with the real prototypes and real photos, divided into classes whose prototypes were seen during training and classes not seen in the model training phase.
Figure 8.
Examples of challenges in recognizing similar products. The first pair represents the challenge of recognizing different flavor variants of a given product, and the second and third pairs illustrate the challenge of recognizing different sizes of the same product.
Table 1.
Comparison of recall metrics under different evaluation methods and distance measures: stopping after a defined number of epochs is reached or once a target validation accuracy is achieved.

| Distance | Method | Recall (All) | Recall (Train) | Recall (Test) | Top 2-nn | Top 3-nn |
|---|---|---|---|---|---|---|
| Euclidean | Reach defined number of epochs | 0.888 | 0.894 | 0.883 | 0.972 | 0.986 |
| Euclidean | Trigger after validation accuracy is achieved | 0.769 | 0.939 | 0.623 | 0.825 | 0.839 |
| Cosine | Reach defined number of epochs | 0.916 | 0.909 | 0.922 | 0.986 | 0.993 |
| Cosine | Trigger after validation accuracy is achieved | 0.888 | 0.955 | 0.831 | 0.986 | 0.986 |
Table 2.
One-shot classification recall for different image sizes and algorithm versions, which include different combinations of spatial transformer (stn), augmentation (aug), and separately treated rotations (rotate).

| Image Size | Algorithm's Version | Recall (Classes Seen) | Recall (Classes Unseen) |
|---|---|---|---|
|  | VPE | 0.939 | 0.713 |
|  | VPE + aug | 0.939 | 0.896 |
|  | VPE + aug + rotate | 0.576 | 0.818 |
|  | VPE + stn | 0.939 | 0.948 |
|  | VPE + aug + stn | 0.955 | 0.896 |
|  | VPE | 0.924 | 0.740 |
|  | VPE + aug | 0.970 | 0.909 |
|  | VPE + aug + rotate | 0.712 | 0.909 |
|  | VPE + stn | 0.939 | 0.935 |
|  | VPE + aug + stn | 0.909 | 0.922 |
Table 3.
One-shot classification recall for different loss functions.
| Loss Function | Recall (Classes Seen) | Recall (Classes Unseen) |
|---|---|---|
| BCE + KLD | 0.970 | 0.949 |
| RMSE | 0.970 | 0.949 |
| ERGAS | 0.939 | 0.970 |
| CC | 0.955 | 0.949 |
| RASE | 0.924 | 0.929 |
Table 4.
Summary of classification metrics (recall, precision, and F1-score) for seen and unseen classes. The model was familiarized during training with the classes referred to as seen, although the test images in this subset were never shown to the model. In turn, the unseen classes are completely new to the model and were not used in the training phase.

| Classes | Recall | Precision | F1-Score |
|---|---|---|---|
| **Seen** |  |  |  |
| Class 1, Black Energy, ultra mango, can, orange | 1.000 | 1.000 | 1.000 |
| Class 2, Coca-cola, bottle | 1.000 | 1.000 | 1.000 |
| Class 8, Easy boost, zero sugar, cherry, can, pink | 1.000 | 0.917 | 0.957 |
| Class 9, Easy boost, blueberry and lime, can, purple | 1.000 | 1.000 | 1.000 |
| Class 10, Level up Classic Energy Drink, can, blue | 1.000 | 1.000 | 1.000 |
| Class 11, Dzik, tropic, can, green | 1.000 | 1.000 | 1.000 |
| **Unseen** |  |  |  |
| Class 0, Black Energy, zero sugar, paradise, can, light-blue | 1.000 | 1.000 | 1.000 |
| Class 3, Tiger Pure, passion fruit-lemon, can, light-yellow | 0.909 | 0.909 | 0.909 |
| Class 4, Tiger Hyper Splash, exotic, can, pink | 0.909 | 0.909 | 0.909 |
| Class 5, Black Energy, ultra mojito, can, green | 1.000 | 1.000 | 1.000 |
| Class 6, Red Bull Purple Edition, sugarfree, açai, can, purple | 0.909 | 1.000 | 0.952 |
| Class 7, Lipton Ice Tea, lemon, bottle | 1.000 | 1.000 | 1.000 |
| Class 12, Oshee Isotonic Drink, multifruit, narrow bottle, blue | 1.000 | 1.000 | 1.000 |
| Class 13, Oshee Vitamin Water, lemon-orange, bottle, blue | 1.000 | 1.000 | 1.000 |
Table 5.
One-shot classification recall in the second phase of research for three categories of food products: beverages, dairy, and snacks.

| Category | Recall (Classes Seen) | Recall (Classes Unseen) |
|---|---|---|
| Beverages | 0.939 | 0.725 |
| Dairy | 0.924 | 0.613 |
| Snacks | 0.954 | 0.754 |