Generalized Zero-Shot Image Classification via Partially-Shared Multi-Task Representation Learning
Abstract
1. Introduction
- We present a novel perspective grounded in multi-task learning, which reveals an inherent generalization weakness of existing methods: they lose some transferable visual features.
- We propose a novel GZSL method, termed the partially-shared multi-task representation learning network (PS-GZSL), to jointly preserve complementary and transferable information between discriminative and semantic-relevant features.
- Extensive experiments on five widely-used GZSL benchmark datasets validate the effectiveness of our PS-GZSL and show that the joint contributions of the task-shared and task-specific representations yield more transferable representations.
2. Related Works
3. Methods
3.1. Problem Definition
3.2. Task-Shared and Task-Specific Representations
- Discriminative and Semantic-Relevant Representations. Firstly, task-shared discriminative and semantic-relevant representations encode the discriminative features of images that are related to the corresponding semantic descriptors. These visual features are used for both the discrimination task and the visual-semantic alignment task during the training phase.
- Discriminative but Non-Semantic Representations. Secondly, discriminative but non-semantic features are encoded in discrimination task-specific representations. These features are important for discrimination, but they may not contribute to the visual-semantic alignment task because they are not represented in the semantic descriptors.
- Non-Discriminative but Semantic-Relevant Representations. Finally, non-discriminative but semantic-relevant features are encoded in visual-semantic alignment task-specific representations. These features are not discriminative among seen classes but may be critical for recognizing unseen classes; thus, they contribute only to the visual-semantic alignment task during training. A minimal encoder sketch covering all three representation types follows this list.
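The outline omits the paper's notation for these representations (the original symbols were lost in extraction). As a minimal sketch of the partially-shared idea under our own naming (`PartiallySharedEncoder`, `z_sh`, `z_dis`, `z_sem` are illustrative names, not the paper's symbols), the three representation types can be produced by parallel branches over the same backbone feature and routed to the two tasks:

```python
import torch
import torch.nn as nn

class PartiallySharedEncoder(nn.Module):
    """One task-shared branch plus two task-specific branches over the same
    backbone feature x; dimensions are illustrative defaults."""
    def __init__(self, feat_dim=2048, shared_dim=1024, specific_dim=1024):
        super().__init__()
        # task-shared: discriminative AND semantic-relevant features
        self.shared = nn.Sequential(nn.Linear(feat_dim, shared_dim), nn.ReLU())
        # discrimination-only: discriminative but non-semantic features
        self.dis_specific = nn.Sequential(nn.Linear(feat_dim, specific_dim), nn.ReLU())
        # alignment-only: semantic-relevant but non-discriminative features
        self.sem_specific = nn.Sequential(nn.Linear(feat_dim, specific_dim), nn.ReLU())

    def forward(self, x):
        z_sh = self.shared(x)
        z_dis = self.dis_specific(x)
        z_sem = self.sem_specific(x)
        # the discrimination task sees shared + discrimination-specific features,
        # the alignment task sees shared + alignment-specific features
        return torch.cat([z_sh, z_dis], dim=1), torch.cat([z_sh, z_sem], dim=1)
```

The key property is the routing: only the shared chunk participates in both losses, so each task-specific chunk can retain features that the other task would otherwise suppress.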
3.3. Representation Learning
3.3.1. Mixture-of-Experts
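The body of this subsection is not reproduced in this outline. As a hedged illustration of the named technique, following the multi-gate mixture-of-experts formulation of Ma et al. [17], each branch can aggregate several expert networks through a learned softmax gate; the module name and sizes below are our assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class MoEBranch(nn.Module):
    """MMoE-style branch [17]: a softmax gate computes a per-sample convex
    combination of expert outputs."""
    def __init__(self, in_dim=2048, out_dim=1024, num_experts=3):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())
            for _ in range(num_experts)
        )
        self.gate = nn.Linear(in_dim, num_experts)

    def forward(self, x):
        w = torch.softmax(self.gate(x), dim=1)                    # (batch, E) weights
        e = torch.stack([expert(x) for expert in self.experts], 1)  # (batch, E, out_dim)
        return (w.unsqueeze(-1) * e).sum(dim=1)                   # (batch, out_dim)
```

Here `num_experts` corresponds to the num_sp/num_sh settings analyzed in Section 4.6.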
3.3.2. Instance Contrastive Discrimination Task
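Again as a hedged sketch, since the outline omits this subsection's body: the abbreviation list names SupCon [19], so an instance contrastive discrimination loss over the discrimination-task representations could look as follows, with same-class samples in the batch acting as positives; the helper name and temperature are our assumptions:

```python
import torch
import torch.nn.functional as F

def supcon_loss(z, labels, temperature=0.1):
    """Supervised contrastive loss [19] over a batch of representations z of
    shape (batch, dim); assumes each anchor has at least one same-class positive."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / temperature                        # pairwise similarities
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float('-inf'))      # exclude self-pairs
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_count = pos_mask.sum(dim=1).clamp(min=1)
    # mean log-likelihood of positives per anchor, averaged over the batch
    return -(log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_count).mean()
```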
3.3.3. Relation-Based Visual-Semantic Alignment Task
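The subsection title points to relation-network scoring [20]. A minimal sketch of such a visual-semantic alignment module (names and dimensions are illustrative; the 85-dim default matches the AWA attribute vectors listed in the dataset table):

```python
import torch
import torch.nn as nn

class RelationAlignment(nn.Module):
    """Relation-network-style scorer [20]: a small MLP rates each
    (representation, class semantic descriptor) pair."""
    def __init__(self, rep_dim=2048, sem_dim=85, hidden=1024):
        super().__init__()
        self.relation = nn.Sequential(
            nn.Linear(rep_dim + sem_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, z, class_semantics):
        # z: (batch, rep_dim); class_semantics: (num_classes, sem_dim)
        b, c = z.size(0), class_semantics.size(0)
        pairs = torch.cat([z.unsqueeze(1).expand(b, c, -1),
                           class_semantics.unsqueeze(0).expand(b, c, -1)], dim=2)
        return self.relation(pairs).squeeze(-1)   # (batch, num_classes) scores
```

In [20], such scores are regressed towards one-hot class indicators with an MSE loss; whether PS-GZSL uses exactly that objective is not recoverable from this outline.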
3.4. Feature Generation with Latent Feedback
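The section title combines conditional feature generation with a latent feedback loop, as in TF-VAEGAN [39]. A simplified single-step sketch (the layer sizes, the `encoder` argument, and the one-pass refinement are our assumptions, not the paper's exact design):

```python
import torch
import torch.nn as nn

class FeedbackGenerator(nn.Module):
    """Conditional feature generator whose hidden state is refined once by
    feedback computed from its own first synthesis, loosely following [39]."""
    def __init__(self, noise_dim=128, sem_dim=85, feat_dim=2048, latent_dim=2048):
        super().__init__()
        self.fc1 = nn.Linear(noise_dim + sem_dim, 4096)
        self.act = nn.LeakyReLU(0.2)
        self.fc2 = nn.Linear(4096, feat_dim)
        self.out = nn.ReLU()                          # CNN features are non-negative
        self.feedback = nn.Linear(latent_dim, 4096)   # latent code -> hidden correction

    def forward(self, noise, semantics, encoder=None):
        h = self.act(self.fc1(torch.cat([noise, semantics], dim=1)))
        x_hat = self.out(self.fc2(h))
        if encoder is not None:
            z = encoder(x_hat)            # representation of the first synthesis
            h = h + self.feedback(z)      # single feedback refinement step
            x_hat = self.out(self.fc2(h))
        return x_hat
```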
3.5. Training and Inference
- Train the feature generation and representation learning models based on Equation (8).
- Use the learned models to synthesize unseen-class visual samples and extract their representations.
- Use real visual samples x from seen classes to train the partially-shared representation learning part, and synthesized visual samples to tune the generator.
- Learn the final generalized zero-shot classifier, a single-layer linear softmax classifier, on representations extracted from real seen samples x and from synthesized samples, as depicted in Figure 6 (a minimal sketch of this final stage follows this list).
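As a minimal sketch of the last step above (the function and argument names are ours; the optimizer, epoch count, and full-batch updates are illustrative simplifications):

```python
import torch
import torch.nn as nn

def train_gzsl_classifier(real_seen_reps, seen_labels,
                          syn_unseen_reps, unseen_labels,
                          num_classes, epochs=25, lr=1e-3):
    """Fit the final single-layer linear softmax classifier on representations
    of real seen samples plus representations of synthesized unseen samples."""
    reps = torch.cat([real_seen_reps, syn_unseen_reps])
    labels = torch.cat([seen_labels, unseen_labels])
    clf = nn.Linear(reps.size(1), num_classes)
    opt = torch.optim.Adam(clf.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(clf(reps), labels)
        loss.backward()
        opt.step()
    return clf
```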
4. Experiments
4.1. Datasets
4.2. Metrics
4.3. Implementation Details
4.4. Comparison with State-of-the-Arts
4.5. Ablation Studies
4.5.1. t-SNE Visualization
4.5.2. Effectiveness of Task-Shared & Task-Specific Representations
4.5.3. Analysis of Model Components
4.6. Hyper-Parameter Analysis
- Effect of the Number of Synthesized Samples. We varied the number of synthesized samples per class, as shown in Figure 9. Performance on all four datasets increased as more examples were synthesized, demonstrating that the feature generation in our PS-GZSL relieves the bias towards seen classes. However, generating too many samples impairs the accuracy on seen classes and eventually lowers the harmonic mean, so an appropriate value must be selected to balance seen and unseen accuracy (see the metric helpers sketched after this list).
- Effect of the Number of Experts. Since we use MoE modules in each branch, the configuration of the expert networks is important for our method. As shown in Figure 10, we study different numbers of task-specific and task-shared experts, denoted num_sp and num_sh, respectively. As num_sp and num_sh increase, the harmonic mean first rises and then drops, peaking at num_sp = 3 and num_sh = 3. For convenience, both are therefore set to 3, which achieves considerable performance on all of the remaining datasets.
- Effect of the Representation Dimensions. Intuitively, the dimensions of the task-shared and task-specific representations have a significant impact on the optimization of the two sub-tasks, and ultimately on the transferability and expressiveness of the concatenated final representations. To explore the sensitivity of PS-GZSL to the latent dimensionality, Figure 11 reports the harmonic mean accuracy on AWA2 and CUB for dimensions of 256, 512, 1024, and 2048 for both the task-specific and task-shared representations (denoted spSize and shSize, respectively). With spSize and shSize both set to 1024, PS-GZSL consistently outperforms the other settings on AWA2 and CUB; therefore, both are set to 1024 on all of the remaining datasets.
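For reference, the U/S/H numbers discussed above follow the standard GZSL protocol [1]: U and S are per-class average top-1 accuracies on unseen and seen test classes, and H is their harmonic mean. A small sketch with our own helper names:

```python
import numpy as np

def per_class_accuracy(y_true, y_pred):
    """Mean of per-class accuracies; assumes every class in y_true occurs."""
    classes = np.unique(y_true)
    return float(np.mean([(y_pred[y_true == c] == c).mean() for c in classes]))

def harmonic_mean(u, s):
    """H = 2*U*S / (U + S), balancing unseen (U) and seen (S) accuracy."""
    return 2 * u * s / (u + s) if (u + s) > 0 else 0.0
```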
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Definition
---|---
ZSL | Zero-Shot Learning
GZSL | Generalized Zero-Shot Learning
SupCon | Supervised Contrastive (learning)
MoE | Mixture-of-Experts
PS | Partially-Shared (mechanism)
References
1. Xian, Y.; Schiele, B.; Akata, Z. Zero-shot learning—the good, the bad and the ugly. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4582–4591.
2. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781.
3. Lampert, C.H.; Nickisch, H.; Harmeling, S. Learning to detect unseen object classes by between-class attribute transfer. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 951–958.
4. Palatucci, M.; Pomerleau, D.; Hinton, G.E.; Mitchell, T.M. Zero-shot learning with semantic output codes. Adv. Neural Inf. Process. Syst. 2009, 22, 1410–1418.
5. Chao, W.L.; Changpinyo, S.; Gong, B.; Sha, F. An empirical study and analysis of generalized zero-shot learning for object recognition in the wild. In Proceedings of the Computer Vision—ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016, Proceedings, Part II; Springer: Cham, Switzerland, 2016; pp. 52–68.
6. Saad, E.; Paprzycki, M.; Ganzha, M.; Bădică, A.; Bădică, C.; Fidanova, S.; Lirkov, I.; Ivanović, M. Generalized Zero-Shot Learning for Image Classification—Comparing Performance of Popular Approaches. Information 2022, 13, 561.
7. Li, X.; Xu, Z.; Wei, K.; Deng, C. Generalized zero-shot learning via disentangled representation. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 2–9 February 2021; Volume 35, pp. 1966–1974.
8. Chen, L.; Zhang, H.; Xiao, J.; Liu, W.; Chang, S.F. Zero-shot visual recognition using semantics-preserving adversarial embedding networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1043–1052.
9. Schonfeld, E.; Ebrahimi, S.; Sinha, S.; Darrell, T.; Akata, Z. Generalized zero- and few-shot learning via aligned variational autoencoders. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–17 June 2019; pp. 8247–8255.
10. Chen, Z.; Luo, Y.; Qiu, R.; Wang, S.; Huang, Z.; Li, J.; Zhang, Z. Semantics disentangling for generalized zero-shot learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 8712–8720.
11. Tong, B.; Wang, C.; Klinkigt, M.; Kobayashi, Y.; Nonaka, Y. Hierarchical disentanglement of discriminative latent features for zero-shot learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–17 June 2019; pp. 11467–11476.
12. Chou, Y.Y.; Lin, H.T.; Liu, T.L. Adaptive and generative zero-shot learning. In Proceedings of the International Conference on Learning Representations, Virtual Event, 3–7 May 2021.
13. Han, Z.; Fu, Z.; Chen, S.; Yang, J. Contrastive embedding for generalized zero-shot learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual Conference, 19–25 June 2021; pp. 2371–2381.
14. Chen, S.; Wang, W.; Xia, B.; Peng, Q.; You, X.; Zheng, F.; Shao, L. FREE: Feature refinement for generalized zero-shot learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 122–131.
15. Bui, M.H.; Tran, T.; Tran, A.; Phung, D. Exploiting domain-specific features to enhance domain generalization. Adv. Neural Inf. Process. Syst. 2021, 34, 21189–21201.
16. Milbich, T.; Roth, K.; Bharadhwaj, H.; Sinha, S.; Bengio, Y.; Ommer, B.; Cohen, J.P. DiVA: Diverse visual feature aggregation for deep metric learning. In Proceedings of the Computer Vision—ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020, Proceedings, Part VIII; Springer: Cham, Switzerland, 2020; pp. 590–607.
17. Ma, J.; Zhao, Z.; Yi, X.; Chen, J.; Hong, L.; Chi, E.H. Modeling task relationships in multi-task learning with multi-gate mixture-of-experts. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 1930–1939.
18. Jacobs, R.A.; Jordan, M.I.; Nowlan, S.J.; Hinton, G.E. Adaptive mixtures of local experts. Neural Comput. 1991, 3, 79–87.
19. Khosla, P.; Teterwak, P.; Wang, C.; Sarna, A.; Tian, Y.; Isola, P.; Maschinot, A.; Liu, C.; Krishnan, D. Supervised contrastive learning. Adv. Neural Inf. Process. Syst. 2020, 33, 18661–18673.
20. Sung, F.; Yang, Y.; Zhang, L.; Xiang, T.; Torr, P.H.; Hospedales, T.M. Learning to compare: Relation network for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1199–1208.
21. Xian, Y.; Lorenz, T.; Schiele, B.; Akata, Z. Feature generating networks for zero-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 5542–5551.
22. Frome, A.; Corrado, G.S.; Shlens, J.; Bengio, S.; Dean, J.; Ranzato, M.; Mikolov, T. DeViSE: A deep visual-semantic embedding model. Adv. Neural Inf. Process. Syst. 2013, 26, 2121–2129.
23. Akata, Z.; Reed, S.; Walter, D.; Lee, H.; Schiele, B. Evaluation of output embeddings for fine-grained image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 2927–2936.
24. Akata, Z.; Perronnin, F.; Harchaoui, Z.; Schmid, C. Label-embedding for attribute-based classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 819–826.
25. Romera-Paredes, B.; Torr, P. An embarrassingly simple approach to zero-shot learning. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 2152–2161.
26. Liu, S.; Long, M.; Wang, J.; Jordan, M.I. Generalized zero-shot learning with deep calibration network. Adv. Neural Inf. Process. Syst. 2018, 31, 2009–2019.
27. Yang, G.; Han, A.; Liu, X.; Liu, Y.; Wei, T.; Zhang, Z. Enhancing Semantic-Consistent Features and Transforming Discriminative Features for Generalized Zero-Shot Classifications. Appl. Sci. 2022, 12, 12642.
28. Li, J.; Jing, M.; Lu, K.; Ding, Z.; Zhu, L.; Huang, Z. Leveraging the invariant side of generative zero-shot learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 7402–7411.
29. Xian, Y.; Sharma, S.; Schiele, B.; Akata, Z. f-VAEGAN-D2: A feature generating framework for any-shot learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 10275–10284.
30. Felix, R.; Reid, I.; Carneiro, G. Multi-modal cycle-consistent generalized zero-shot learning. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 21–37.
31. Vyas, M.R.; Venkateswara, H.; Panchanathan, S. Leveraging seen and unseen semantic relationships for generative zero-shot learning. In Proceedings of the Computer Vision—ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020, Proceedings, Part XXX; Springer: Cham, Switzerland, 2020; pp. 70–86.
32. Li, Z.; Zhang, D.; Wang, Y.; Lin, D.; Zhang, J. Generative Adversarial Networks for Zero-Shot Remote Sensing Scene Classification. Appl. Sci. 2022, 12, 3760.
33. Sohn, K.; Lee, H.; Yan, X. Learning structured output representation using deep conditional generative models. Adv. Neural Inf. Process. Syst. 2015, 28, 3483–3491.
34. Verma, V.K.; Arora, G.; Mishra, A.; Rai, P. Generalized zero-shot learning via synthesized examples. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4281–4289.
35. Kim, J.; Shim, K.; Shim, B. Semantic feature extraction for generalized zero-shot learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 22 February–1 March 2022; Volume 36, pp. 1166–1173.
36. Tang, H.; Liu, J.; Zhao, M.; Gong, X. Progressive layered extraction (PLE): A novel multi-task learning (MTL) model for personalized recommendations. In Proceedings of the 14th ACM Conference on Recommender Systems, Virtual Event, 22–26 September 2020; pp. 269–278.
37. Park, H.; Yeo, J.; Wang, G.; Hwang, S.W. Soft representation learning for sparse transfer. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 1560–1568.
38. Xin, S.; Jiao, Y.; Long, C.; Wang, Y.; Wang, X.; Yang, S.; Liu, J.; Zhang, J. Prototype Feature Extraction for Multi-task Learning. In Proceedings of the ACM Web Conference 2022, Lyon, France, 25–29 April 2022; pp. 2472–2481.
39. Narayan, S.; Gupta, A.; Khan, F.S.; Snoek, C.G.; Shao, L. Latent embedding feedback and discriminative features for zero-shot classification. In Proceedings of the Computer Vision—ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020, Proceedings, Part XXII; Springer: Cham, Switzerland, 2020; pp. 479–495.
40. Wah, C.; Branson, S.; Welinder, P.; Perona, P.; Belongie, S. The Caltech-UCSD Birds-200-2011 Dataset. 2011. Available online: https://authors.library.caltech.edu/27452/ (accessed on 29 March 2023).
41. Nilsback, M.E.; Zisserman, A. Automated flower classification over a large number of classes. In Proceedings of the 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing, Bhubaneswar, India, 16–19 December 2008; pp. 722–729.
42. Patterson, G.; Hays, J. SUN attribute database: Discovering, annotating, and recognizing scene attributes. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 2751–2758.
43. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
44. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
45. Reed, S.; Akata, Z.; Lee, H.; Schiele, B. Learning deep representations of fine-grained visual descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 49–58.
46. Jiang, H.; Wang, R.; Shan, S.; Chen, X. Transferable contrastive network for generalized zero-shot learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9765–9774.
47. Min, S.; Yao, H.; Xie, H.; Wang, C.; Zha, Z.J.; Zhang, Y. Domain-aware visual bias eliminating for generalized zero-shot learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 12664–12673.
48. Li, K.; Min, M.R.; Fu, Y. Rethinking zero-shot learning: A conditional visual classification perspective. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3583–3592.
Model Comparison | ◯ Task-Shared | □ Task-Specific | △ Task-Specific
---|---|---|---
SP-AEN [8] | √ | √ |
CADA-VAE [9] | √ | |
SDGZSL [10] | √ | √ |
DLFZRL [11] | √ | √ |
DR-GZSL [7] | √ | |
CE-GZSL [13] | √ | |
Our PS-GZSL | √ | √ | √

◯: discriminative and semantic-relevant; □: discriminative but non-semantic; △: non-discriminative but semantic-relevant (see Section 3.2).
Dataset | AWA1 | AWA2 | CUB | FLO | SUN
---|---|---|---|---|---
#Seen Classes | 40 | 40 | 150 | 82 | 645
#Unseen Classes | 10 | 10 | 50 | 20 | 72
#Samples | 30,475 | 37,322 | 11,788 | 8189 | 14,340
Semantic Descriptor Dimension | 85 | 85 | 1024 | 1024 | 102
#Training Samples | 19,832 | 23,527 | 7057 | 5394 | 10,320
#Test Seen Samples | 4958 | 5882 | 1764 | 1640 | 2580
#Test Unseen Samples | 5685 | 7913 | 2967 | 1155 | 1440
Methods | AWA1 U | AWA1 S | AWA1 H | AWA2 U | AWA2 S | AWA2 H | CUB U | CUB S | CUB H | FLO U | FLO S | FLO H | SUN U | SUN S | SUN H
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
DeViSE [22] | 13.4 | 68.7 | 22.4 | 17.1 | 74.7 | 27.8 | 23.8 | 53.0 | 32.8 | 9.9 | 44.2 | 16.2 | 16.9 | 27.4 | 20.9 |
TCN [46] | 49.4 | 76.5 | 60.0 | 61.2 | 65.8 | 63.4 | 52.6 | 52.0 | 52.3 | - | - | - | 31.2 | 37.3 | 34.0 |
DVBE [47] | - | - | - | 63.6 | 70.8 | 67.0 | 53.2 | 60.2 | 56.5 | - | - | - | 45.0 | 37.2 | 40.7 |
f-CLSWGAN [21] | 57.9 | 64.0 | 60.2 | - | - | - | 43.7 | 57.7 | 49.7 | 59.0 | 73.8 | 65.6 | 42.6 | 36.6 | 39.4 |
CADA-VAE [9] | 57.3 | 72.8 | 64.1 | 55.8 | 75.0 | 63.9 | 51.6 | 53.5 | 52.4 | - | - | - | 47.2 | 35.7 | 40.6 |
SP-AEN [8] | - | - | - | 23.3 | 90.9 | 37.1 | 34.7 | 70.6 | 46.6 | - | - | - | 24.9 | 38.6 | 30.3 |
LisGAN [28] | 52.6 | 76.3 | 62.3 | - | - | - | 46.5 | 57.9 | 51.6 | 57.7 | 83.8 | 68.3 | 42.9 | 37.8 | 40.2 |
cycle-CLSWGAN [30] | 56.9 | 64.0 | 60.2 | - | - | - | 45.7 | 61.0 | 52.3 | 59.2 | 72.5 | 65.1 | 49.4 | 33.6 | 40.0 |
DLFZRL [11] | - | - | 61.2 | - | - | 60.9 | - | - | 51.9 | - | - | - | - | - | 42.5 |
cvcZSL [48] | 62.7 | 77.0 | 69.1 | 56.4 | 81.4 | 66.7 | 47.4 | 47.6 | 47.5 | - | - | - | 36.3 | 42.8 | 39.3 |
f-VAEGAN-D2 [29] | 57.9 | 61.4 | 59.6 | - | - | - | 43.7 | 57.7 | 49.7 | 59.0 | 73.8 | 65.6 | 42.6 | 36.6 | 39.4 |
LsrGAN [31] | 54.6 | 74.6 | 63.0 | - | - | - | 48.1 | 59.1 | 53.0 | - | - | - | 44.8 | 37.7 | 40.9 |
TF-VAEGAN [39] | - | - | - | 59.8 | 75.1 | 66.6 | 52.8 | 64.7 | 58.1 | 62.5 | 84.1 | 71.7 | 45.6 | 40.7 | 43.0 |
DR-GZSL [7] | 60.7 | 72.9 | 66.2 | 56.9 | 80.2 | 66.6 | 51.1 | 58.2 | 54.4 | - | - | - | 36.6 | 47.6 | 41.4 |
SDGZSL [10] | - | - | - | 64.6 | 73.6 | 68.8 | 59.9 | 66.4 | 63.0 | 62.2 | 79.3 | 69.8 | 48.2 | 36.1 | 41.3 |
CE-GZSL [13] | 65.3 | 73.4 | 69.1 | 63.1 | 78.6 | 70.0 | 63.9 | 66.8 | 65.3 | 69.0 | 78.7 | 73.5 | 48.8 | 38.6 | 43.1 |
FREE [14] | 62.9 | 69.4 | 66.0 | 60.4 | 75.4 | 67.1 | 55.7 | 59.9 | 57.7 | 67.4 | 84.5 | 75.0 | 47.4 | 37.2 | 41.7 |
Our PS-GZSL | 67.5 | 74.1 | 70.6 | 66.4 | 78.1 | 71.8 | 70.6 | 64.5 | 67.4 | 66.8 | 82.5 | 73.8 | 50.1 | 38.1 | 43.3 |
Methods | AWA1 | AWA2 | CUB | FLO | SUN |
---|---|---|---|---|---|
DEVISE [22] | 54.2 | 59.7 | 52.0 | 45.9 | 56.5 |
SJE [23] | 65.6 | 61.9 | 53.9 | 53.4 | 53.7 |
ALE [24] | 59.9 | 62.5 | 54.9 | 48.5 | 58.1 |
ESZSL [25] | 58.2 | 58.6 | 53.9 | 51.0 | 54.5 |
DCN [26] | 65.2 | - | 56.2 | - | 61.8 |
CADA-VAE [9] | - | 64.0 | 60.4 | 65.2 | 61.8 |
SP-AEN [8] | 58.5 | - | 55.4 | - | 59.2 |
cycle-CLSWGAN [30] | 66.3 | - | 58.4 | 70.1 | 60.0 |
DLFZRL [11] | 71.3 | 70.3 | 61.8 | - | 61.3 |
TCN [46] | 70.3 | 71.2 | 59.5 | - | 61.5 |
f-CLSWGAN [21] | 68.2 | - | 57.3 | 67.2 | 60.8 |
f-VAEGAN-D2 [29] | - | 71.1 | 61.0 | 67.7 | 64.7 |
TF-VAEGAN [39] | - | 72.2 | 64.9 | 70.8 | 66.0 |
AGZSL [12] | - | 72.8 | 76.0 | - | 63.3 |
SDGZSL [10] | - | 72.1 | 75.5 | 73.3 | 62.4 |
CE-GZSL [13] | 71.0 | 70.4 | 77.5 | 70.6 | 63.3 |
Our PS-GZSL | 71.5 | 72.9 | 78.1 | 71.3 | 64.7
Version | AWA2 U | AWA2 S | AWA2 H | CUB U | CUB S | CUB H
---|---|---|---|---|---|---
PS-GZSL w/o MoE&PS | 65.7 | 74.8 | 69.9 | 71.5 | 61.3 | 66.0 |
PS-GZSL w/o PS | 66.9 | 74.8 | 70.7 | 67.0 | 66.8 | 66.9 |
PS-GZSL w/o MoE | 61.4 | 79.8 | 69.4 | 68.4 | 63.1 | 65.6 |
PS-GZSL w/o w/ | 66.0 | 75.5 | 70.5 | 66.9 | 66.2 | 66.5 |
PS-GZSL w/o w/ | 65.7 | 77.8 | 71.2 | 67.5 | 66.8 | 67.2 |
PS-GZSL | 66.4 | 78.1 | 71.8 | 70.1 | 64.5 | 67.4 |
Citation: Wang, G.; Tang, S. Generalized Zero-Shot Image Classification via Partially-Shared Multi-Task Representation Learning. Electronics 2023, 12, 2085. https://doi.org/10.3390/electronics12092085