Contrastive Learning Penalized Cross-Entropy with Diversity Contrastive Search Decoding for Diagnostic Report Generation of Reduced Token Repetition
Abstract
1. Introduction
- An objective function, CLpCE, is designed to balance unsupervised and supervised learning during model fine-tuning and to enhance the consistency of sentence-embedding feature representations (a minimal loss sketch follows this list).
- A novel decoding method, DCS, is developed to improve representation diversity and to relieve the anisotropic distribution of generated tokens while maintaining summarization quality (see the decoding sketch after this list).
- A supplementary metric, the maximum token repetition ratio (maxTRR), is introduced to estimate token repetition and to select the final generated output.
- The effectiveness of the proposed CLpCEwDCS decoding framework is verified on the DRG task, showing competitive accuracy and improved diversity.
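As a rough illustration of how such a combined objective can be set up, the sketch below interpolates a token-level cross-entropy term with a SimCSE-style in-batch contrastive term. The function name, the contrastive branch, and the temperature are assumptions made only for illustration; the convex combination itself mirrors the weight labels in the accuracy table (0.0 marked CE, 1.0 marked CL, 0.6 marked CLpCE).

```python
import torch
import torch.nn.functional as F

def clpce_loss(logits, labels, sent_emb_a, sent_emb_b, cl_weight=0.6, tau=0.05):
    """Hypothetical CLpCE-style objective: (1 - w) * CE + w * CL.

    cl_weight follows the weight labels in the results table
    (0.0 -> pure CE, 1.0 -> pure CL, 0.6 -> the reported CLpCE setting).
    The contrastive branch assumes two dropout views of each sentence,
    as in SimCSE; the authors' exact formulation may differ.
    """
    # Supervised branch: token-level cross-entropy for report generation.
    ce = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1),
                         ignore_index=-100)

    # Unsupervised branch: in-batch contrastive loss over sentence embeddings.
    a = F.normalize(sent_emb_a, dim=-1)          # (batch, dim)
    b = F.normalize(sent_emb_b, dim=-1)          # (batch, dim)
    sim = a @ b.t() / tau                        # (batch, batch) cosine similarities
    targets = torch.arange(a.size(0), device=a.device)
    cl = F.cross_entropy(sim, targets)           # positives sit on the diagonal

    return (1.0 - cl_weight) * ce + cl_weight * cl
```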
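On the decoding side, the baseline behaviour can be pictured with the contrastive search rule of Su et al., which scores each top-k candidate by model confidence minus a degeneration penalty. The sketch below assumes a Hugging Face-style causal LM interface; how DCS adds further diversity on top of this rule (governed by the diversity control value swept in Section 4.3) is not reproduced here, so the function and its defaults are an illustrative assumption rather than the authors' algorithm.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def contrastive_search_step(model, input_ids, k=5, alpha=0.6):
    """One contrastive search step, sketched for a GPT-2-style causal LM.

    score(v) = (1 - alpha) * p(v | context)
               - alpha * max cosine similarity between the candidate's hidden
                 state and the hidden states of the tokens generated so far.
    A DCS-style variant would additionally perturb or sample among the
    top-scoring candidates to raise diversity (assumed, not shown here).
    """
    out = model(input_ids, output_hidden_states=True)
    probs = F.softmax(out.logits[:, -1, :], dim=-1)             # (1, vocab)
    topk_probs, topk_ids = probs.topk(k, dim=-1)                # (1, k)
    context_h = F.normalize(out.hidden_states[-1][0], dim=-1)   # (seq, dim)

    scores = []
    for prob, tok in zip(topk_probs[0], topk_ids[0]):
        cand = torch.cat([input_ids, tok.view(1, 1)], dim=-1)
        h = model(cand, output_hidden_states=True).hidden_states[-1][0, -1]
        degeneration = (context_h @ F.normalize(h, dim=-1)).max()
        scores.append((1 - alpha) * prob - alpha * degeneration)

    best = torch.stack(scores).argmax()
    return topk_ids[0, best].view(1, 1)  # next token id, shape (1, 1)
```

Generation then simply repeats this step, appending the returned token to the context until an end-of-sequence token or a length limit is reached.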
2. Related Techniques
2.1. GPT-2 Decoder Block
2.2. Contrastive Learning of Sentence Embeddings
2.3. Contrastive Search Decoding
3. Materials and Methods
3.1. Data Collection
3.2. The Proposed CLpCEwDCS Decoding Framework
3.2.1. The Backbone Network Selection
3.2.2. The CLpCE Objective Function
3.2.3. The DCS Decoding
3.3. Experiment Design
3.4. Evaluation Metrics
3.5. Implementation Details and Parameter Settings
4. Results
4.1. DRG Accuracy
4.2. DRG Diversity
4.3. The Effect of the Diversity Control
4.4. Achievement of the First-Tier Teams on the Competition
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition
---|---
DRG | Diagnostic report generation
LLM | Large language model
CLpCE | Contrastive learning penalized cross-entropy
DCS | Diversity contrastive search
maxTRR | Maximum of token repetition ratio
GPT | Generative pre-trained Transformer
CLpCEwDCS | CLpCE with DCS
BLEU | Bilingual evaluation understudy
METEOR | Metric for evaluation of translation with explicit ordering
ROUGE | Recall-oriented understudy for gisting evaluation
CIDER | Consensus-based image description evaluation
NLP | Natural language processing
RNN | Recurrent neural network
LSTM | Long short-term memory
BERT | Bidirectional encoder representations from Transformers
BART | Bidirectional and autoregressive Transformer
T5 | Text-to-text transfer Transformer
CL | Contrastive learning
CLEAR | Contrastive learning for sentence representation
TaCL | Token-aware contrastive learning
GS | Greedy search
NS | Nucleus search
CS | Contrastive search
FECS | Fidelity-enriched contrastive search
RRM | Repetition reduction module
PT | Pre-training
FT | Fine-tuning
SimCSE | Simple contrastive learning of sentence embeddings
TkS | Top-k search
GBPQ | Graph beam search with priority queue
RAG | Retrieval-augmented generation
References
- Kryscinski, W.; Keskar, N.S.; McCann, B.; Xiong, C.; Socher, R. Neural text summarization: A critical evaluation. arXiv 2019, arXiv:1908.08960. [Google Scholar]
- Allahyari, M.; Pouriyeh, S.; Assefi, M.; Safaei, S.; Trippe, E.D.; Gutierrez, J.B.; Kochut, K. Text summarization techniques: A brief survey. arXiv 2017, arXiv:1707.02268. [Google Scholar] [CrossRef]
- Pang, T.; Li, P.; Zhao, L. A survey on automatic generation of medical imaging reports based on deep learning. Biomed. Eng. Online 2022, 22, 48. [Google Scholar] [CrossRef] [PubMed]
- Chen, Z.; Varma, M.; Delbrouck, J.; Paschali, M.; Blankemeier, L.; Van Veen, D.; Valanarasu, J.; Youssef, A.; Cohen, J.; Reis, E. CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation. arXiv 2024, arXiv:2401.12208. [Google Scholar]
- Jones, K.S. Automatic summarising: The state of the art. Inf. Process. Manag. 2007, 43, 1449–1481. [Google Scholar] [CrossRef]
- Minaee, S.; Kalchbrenner, N.; Cambria, E.; Nikzad, N.; Chenaghlu, M.; Gao, J. Deep learning–based text classification: A comprehensive review. ACM Comput. Surv. 2021, 54, 1–40. [Google Scholar] [CrossRef]
- Schuster, M.; Paliwal, K.K. Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 1997, 45, 2673–2681. [Google Scholar] [CrossRef]
- Van Houdt, G.; Mosquera, C.; Napoles, G. A review on the long short-term memory model. Artif. Intell. Rev. 2020, 53, 5929–5955. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 6000–6010. [Google Scholar]
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
- Paulus, R.; Xiong, C.; Socher, R. A deep reinforced model for abstractive summarization. arXiv 2017, arXiv:1705.04304. [Google Scholar]
- Chuang, Y.; Tang, R.; Jiang, X.; Hu, X. SPeC: A soft prompt-based calibration on performance variability of large language model in clinical notes summarization. J. Biomed. Inform. 2024, 151, 104606. [Google Scholar] [CrossRef] [PubMed]
- Tian, S.; Jin, Q.; Yeganova, L.; Lai, P.; Zhu, Q.; Chen, X.; Yang, Y.; Chen, Q.; Kim, W.; Comeau, D. Opportunities and challenges for ChatGPT and large language models in biomedicine and health. Briefings Bioinform. 2024, 25, bbad493. [Google Scholar] [CrossRef] [PubMed]
- Li, J.; Li, D.; Savarese, S.; Hoi, S. Blip-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. arXiv 2023, arXiv:2301.12597. [Google Scholar]
- Van Veen, D.; Van Uden, C.; Blankemeier, L.; Delbrouck, J.; Aali, A.; Bluethgen, C.; Pareek, A.; Polacin, M.; Reis, E.; Seehofnerová, A. Adapted large language models can outperform medical experts in clinical text summarization. Nat. Med. 2024. [Google Scholar] [CrossRef]
- Dong, Y.; Cordonnier, J.-B.; Loukas, A. Attention is not all you need: Pure attention loses rank doubly exponentially with depth. In Proceedings of the 38th International Conference on Machine Learning, Virtual Event, 18–24 July 2021; pp. 2793–2803. [Google Scholar]
- Ethayarajh, K. How contextual are contextualized word representations? Comparing the geometry of BERT, ELMo, and GPT-2 embeddings. arXiv 2019, arXiv:1909.00512. [Google Scholar]
- Su, Y.; Liu, F.; Meng, Z.; Lan, T.; Shu, L.; Shareghi, E.; Collier, N. TaCL: Improving BERT pre-training with token-aware contrastive learning. arXiv 2021, arXiv:2111.04198. [Google Scholar]
- Su, Y.; Lan, T.; Wang, Y.; Yogatama, D.; Kong, L.; Collier, N. A contrastive framework for neural text generation. Adv. Neural Inf. Process. Syst. 2022, 35, 21548–21561. [Google Scholar]
- Li, B.; Zhou, H.; He, J.; Wang, M.; Yang, Y.; Li, L. On the sentence embeddings from pre-trained language models. arXiv 2020, arXiv:2011.05864. [Google Scholar]
- Wang, Z.; Zeng, J.; Tao, H.; Zhong, L. RBPSum: An extractive summarization approach using Bi-stream attention and position residual connection. In Proceedings of the 2023 International Joint Conference on Neural Networks (IJCNN), Gold Coast, Australia, 18–23 June 2023; pp. 1–8. [Google Scholar]
- Abanoub, G.E.; Fawzy, A.M.; Waly, R.R.; Gomaa, W.H. Generate descriptions of medical dialogues through two-layers Transformer-based summarization. Intell. Method Syst. Appl. 2023, 32–37. [Google Scholar] [CrossRef]
- Lewis, M.; Liu, Y.; Goyal, N.; Ghazvininejad, M.; Mohamed, A.; Levy, O.; Stoyanov, V.; Zettlemoyer, L. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv 2019, arXiv:1910.13461. [Google Scholar]
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 2020, 21, 5485–5551. [Google Scholar]
- Chuang, C.-Y.; Robinson, J.; Lin, Y.-C.; Torralba, A.; Jegelka, S. Debiased contrastive learning. Adv. Neural Inf. Process. Syst. 2020, 33, 8765–8775. [Google Scholar]
- Welleck, S.; Kulikov, I.; Roller, S.; Dinan, E.; Cho, K.; Weston, J. Neural text generation with unlikelihood training. arXiv 2019, arXiv:1908.04319. [Google Scholar]
- Wu, Z.; Wang, S.; Gu, J.; Khabsa, M.; Sun, F.; Ma, H. CLEAR: Contrastive learning for sentence representation. arXiv 2020, arXiv:2012.15466. [Google Scholar]
- Tan, C.; Sun, X. CoLRP: A contrastive learning abstractive text summarization method with ROUGE penalty. In Proceedings of the 2023 International Joint Conference on Neural Networks (IJCNN), Gold Coast, Australia, 18–23 June 2023; pp. 1–7. [Google Scholar]
- Mai, T.P.; Nguyen, Q.A.; Can, D.C.; Le, H.Q. Contrastive hierarchical discourse graph for vietnamese extractive multi-document summarization. In Proceedings of the 2023 International Conference on Asian Language Processing (IALP), Singapore, 18–20 November 2023; pp. 118–123. [Google Scholar]
- Klein, G.; Kim, Y.; Deng, Y.; Senellart, J.; Rush, A. OpenNMT: Open-Source Toolkit for Neural Machine Translation. Annu. Meet. Assoc. Comput. Linguist. Syst. Demonstr. 2017, 35, 67–72. [Google Scholar]
- Holtzman, A.; Buys, J.; Du, L.; Forbes, M.; Choi, Y. The curious case of neural text degeneration. arXiv 2019, arXiv:1904.09751. [Google Scholar]
- Fu, Z.; Lam, W.; So, A.; Shi, B. A theoretical analysis of the repetition problem in text generation. Proc. AAAI Conf. Artif. Intell. 2021, 35, 12848–12856. [Google Scholar] [CrossRef]
- Su, Y.; Xu, J. An empirical study on contrastive search and contrastive decoding for open-ended text generation. arXiv 2022, arXiv:2211.10797. [Google Scholar]
- Chen, W.L.; Wu, C.K.; Chen, H.H.; Chen, C.C. Fidelity-enriched contrastive search: Reconciling the faithfulness-diversity trade-off in text generation. arXiv 2023, arXiv:2310.14981. [Google Scholar]
- Zhang, Y.; Kamigaito, H.; Aoki, T.; Takamura, H.; Okumura, M. Generic Mechanism for Reducing Repetitions in Encoder-Decoder Models. J. Nat. Lang. Process. 2023, 30, 401–431. [Google Scholar] [CrossRef]
- Xu, J.; Liu, X.; Yan, J.; Cai, D.; Li, H.; Li, J. Learning to break the loop: Analyzing and mitigating repetitions for neural text generation. Adv. Neural Inf. Process. Syst. 2022, 35, 3082–3095. [Google Scholar]
- Hadsell, R.; Chopra, S.; LeCun, Y. Dimensionality reduction by learning an invariant mapping. IEEE Comput. Vis. Pattern Recognit. 2006, 2, 1735–1742. [Google Scholar]
- Chen, T.; Kornblith, S.; Norouzi, M.; Hinton, G. A simple framework for contrastive learning of visual representations. Int. Conf. Mach. Learn. 2020, 119, 1597–1607. [Google Scholar]
- Du, Z. GPT2-Chinese: Tools for Training GPT2 Model in Chinese Language; GitHub Repository, 2019. [Google Scholar]
- Shao, Y.; Geng, Z.; Liu, Y.; Dai, J.; Yan, H.; Yang, F.; Zhe, L.; Bao, H.; Qiu, X. CPT: A pre-trained unbalanced transformer for both Chinese language understanding and generation. arXiv 2021, arXiv:2109.05729. [Google Scholar]
- Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; Anadkat, S.; et al. GPT-4 technical report. arXiv 2023, arXiv:2303.08774. [Google Scholar]
- Fan, A.; Lewis, M.; Dauphin, Y. Hierarchical neural story generation. arXiv 2018, arXiv:1805.04833. [Google Scholar]
- Papineni, K.; Roukos, S.; Ward, T.; Zhu, W.-J. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, 6–12 July 2002; pp. 311–318. [Google Scholar]
- Banerjee, S.; Lavie, A. METEOR: An automatic metric for mt evaluation with improved correlation with human judgments. In Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization; Association for Computational Linguistics: Toronto, ON, Canada, 2005; pp. 65–72. [Google Scholar]
- Lin, C.-Y. ROUGE: A Package for Automatic Evaluation of Summaries. Text Summarization Branches Out. 2004; pp. 74–81. Available online: https://aclanthology.org/W04-1013.pdf (accessed on 19 March 2024).
- Vedantam, R.; Lawrence Zitnick, C.; Parikh, D. Cider: Consensus-based image description evaluation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 4566–4575. [Google Scholar]
- Loshchilov, I.; Hutter, F. Decoupled weight decay regularization. arXiv 2017, arXiv:1711.05101. [Google Scholar]
- Wu, L.; Li, J.; Wang, Y.; Meng, Q.; Qin, T.; Chen, W.; Zhang, M.; Liu, T. R-drop: Regularized dropout for neural networks. Adv. Neural Inf. Process. Syst. 2021, 34, 10890–10905. [Google Scholar]
- Izmailov, P.; Podoprikhin, D.; Garipov, T.; Vetrov, D.; Wilson, A. Averaging weights leads to wider optima and better generalization. arXiv 2018, arXiv:1803.05407. [Google Scholar]
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826. [Google Scholar]
- Wu, X.; Gao, Y.; Zhang, H.; Yang, Y.; Guo, W.; Lu, J. The Solution for the CVPR2023 NICE Image Captioning Challenge. arXiv 2023, arXiv:2310.06879. [Google Scholar]
- Lewis, P.; Perez, E.; Piktus, A.; Petroni, F.; Karpukhin, V.; Goyal, N.; Küttler, H.; Lewis, M.; Yih, W.; Rocktäschel, T. Retrieval-augmented generation for knowledge-intensive nlp tasks. Adv. Neural Inf. Process. Syst. 2020, 33, 9459–9474. [Google Scholar]
- Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language models are unsupervised multitask learners. OpenAI Blog 2019, 1, 9. [Google Scholar]
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901. [Google Scholar]
- Du, Z.; Qian, Y.; Liu, X.; Ding, M.; Qiu, J.; Yang, Z.; Tang, J. GLM: General language model pretraining with autoregressive blank infilling. arXiv 2022, arXiv:2103.10360. [Google Scholar]
- Baevski, A.; Hsu, W.-N.; Xu, Q.; Babu, A.; Gu, J.; Auli, M. Data2vec: A general framework for self-supervised learning in speech, vision and language. In Proceedings of the 39th International Conference on Machine Learning, Baltimore, MD, USA, 17–23 July 2022; pp. 1298–1312. [Google Scholar]
- Uc-Cetina, V.; Navarro-Guerrero, N.; Martin-Gonzalez, A.; Weber, C.; Wermter, S. Survey on reinforcement learning for language processing. Artif. Intell. Rev. 2023, 56, 1543–1575. [Google Scholar] [CrossRef]
- Ouyang, L.; Wu, J.; Jiang, X.; Almeida, D.; Wainwright, C.; Mishkin, P.; Zhang, C.; Agarwal, S.; Slama, K.; Ray, A. Training language models to follow instructions with human feedback. Adv. Neural Inf. Process. Syst. 2022, 35, 27730–27744. [Google Scholar]
Decoding | CL Weight | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 | METEOR | ROUGE | CIDER
---|---|---|---|---|---|---|---|---
DCS | 0.0 (CE) | 0.4638 | 0.3838 | 0.3223 | 0.2724 | 0.2487 | 0.5057 | 1.3607 |
0.1 | 0.4870 | 0.4049 | 0.3414 | 0.2893 | 0.2585 | 0.5170 | 1.4179 | |
0.2 | 0.4864 | 0.4047 | 0.3414 | 0.2893 | 0.2586 | 0.5186 | 1.4118 | |
0.3 | 0.4794 | 0.3987 | 0.3363 | 0.2853 | 0.2561 | 0.5154 | 1.4162 | |
0.4 | 0.4784 | 0.3979 | 0.3358 | 0.2851 | 0.2560 | 0.5179 | 1.4391 | |
0.5 | 0.4805 | 0.3997 | 0.3372 | 0.2860 | 0.2564 | 0.5154 | 1.4147 | |
0.6 (CLpCE) | 0.4937 | 0.4107 | 0.3461 | 0.2933 | 0.2612 | 0.5182 | 1.4339 | |
0.7 | 0.4855 | 0.4040 | 0.3409 | 0.2894 | 0.2586 | 0.5199 | 1.4459 | |
0.8 | 0.4854 | 0.4033 | 0.3400 | 0.2884 | 0.2582 | 0.5195 | 1.4533 | |
0.9 | 0.4780 | 0.3968 | 0.3342 | 0.2834 | 0.2549 | 0.5147 | 1.4132 | |
1.0 (CL) | 0.0232 | 0.0013 | 0.0002 | 0.0000 | 0.0264 | 0.0284 | 0.0002 | |
CS | 0.0 (CE) | 0.4645 | 0.3843 | 0.3227 | 0.2727 | 0.2491 | 0.5059 | 1.3611 |
0.1 | 0.4858 | 0.4039 | 0.3406 | 0.2887 | 0.2579 | 0.5166 | 1.4196 | |
0.2 | 0.4866 | 0.4045 | 0.3410 | 0.2890 | 0.2585 | 0.5178 | 1.4125 | |
0.3 | 0.4793 | 0.3987 | 0.3364 | 0.2856 | 0.2562 | 0.5159 | 1.4101 | |
0.4 | 0.4767 | 0.3965 | 0.3346 | 0.2841 | 0.2552 | 0.5169 | 1.4255 | |
0.5 | 0.4810 | 0.4003 | 0.3378 | 0.2867 | 0.2568 | 0.5162 | 1.4240 | |
0.6 (CLpCE) | 0.4939 | 0.4112 | 0.3470 | 0.2943 | 0.2616 | 0.5198 | 1.4477 | |
0.7 | 0.4856 | 0.4042 | 0.3410 | 0.2894 | 0.2587 | 0.5196 | 1.4395 | |
0.8 | 0.4864 | 0.4043 | 0.3408 | 0.2892 | 0.2586 | 0.5198 | 1.4525 | |
0.9 | 0.4798 | 0.3984 | 0.3355 | 0.2845 | 0.2558 | 0.5156 | 1.4208 | |
1.0 (CL) | 0.0233 | 0.0012 | 0.0000 | 0.0000 | 0.0266 | 0.0286 | 0.0002 | |
GS | 0.0 (CE) | 0.4681 | 0.3887 | 0.3274 | 0.2773 | 0.2489 | 0.5095 | 1.4090 |
0.1 | 0.4846 | 0.4036 | 0.3410 | 0.2898 | 0.2580 | 0.5210 | 1.4567 | |
0.2 | 0.4881 | 0.4063 | 0.3431 | 0.2914 | 0.2592 | 0.5231 | 1.4684 | |
0.3 | 0.4796 | 0.3999 | 0.3381 | 0.2875 | 0.2568 | 0.5199 | 1.4477 | |
0.4 | 0.4809 | 0.4002 | 0.3376 | 0.2865 | 0.2567 | 0.5214 | 1.4542 | |
0.5 | 0.4834 | 0.4034 | 0.3413 | 0.2904 | 0.2580 | 0.5213 | 1.4682 | |
0.6 (CLpCE) | 0.4901 | 0.4088 | 0.3458 | 0.2941 | 0.2611 | 0.5246 | 1.4861 | |
0.7 | 0.4894 | 0.4077 | 0.3443 | 0.2925 | 0.2602 | 0.5247 | 1.4835 | |
0.8 | 0.4865 | 0.4053 | 0.3424 | 0.2910 | 0.2591 | 0.5244 | 1.4864 | |
0.9 | 0.4812 | 0.3998 | 0.3370 | 0.2860 | 0.2559 | 0.5186 | 1.4583 | |
1.0 (CL) | 0.0122 | 0.0009 | 0.0000 | 0.0000 | 0.0126 | 0.0169 | 0.0000 | |
NS | 0.0 (CE) | 0.4654 | 0.3790 | 0.3136 | 0.2616 | 0.2422 | 0.4859 | 1.2368 |
0.1 | 0.4765 | 0.3907 | 0.3254 | 0.2728 | 0.2492 | 0.4996 | 1.3073 | |
0.2 | 0.4800 | 0.3944 | 0.3290 | 0.2763 | 0.2511 | 0.5017 | 1.2831 | |
0.3 | 0.4775 | 0.3925 | 0.3278 | 0.2757 | 0.2501 | 0.5009 | 1.3221 | |
0.4 | 0.4793 | 0.3939 | 0.3285 | 0.2759 | 0.2504 | 0.5010 | 1.3049 | |
0.5 | 0.4798 | 0.3944 | 0.3292 | 0.2766 | 0.2512 | 0.5017 | 1.3014 | |
0.6 (CLpCE) | 0.4858 | 0.3991 | 0.3326 | 0.2789 | 0.2535 | 0.5044 | 1.3143 | |
0.7 | 0.4799 | 0.3942 | 0.3288 | 0.2758 | 0.2511 | 0.5033 | 1.3322 | |
0.8 | 0.4803 | 0.3942 | 0.3286 | 0.2758 | 0.2511 | 0.5029 | 1.3259 | |
0.9 | 0.4737 | 0.3878 | 0.3226 | 0.2703 | 0.2473 | 0.4961 | 1.2776 | |
1.0 (CL) | 0.0184 | 0.0007 | 0.0000 | 0.0000 | 0.0216 | 0.0226 | 0.0003 | |
TkS | 0.0 (CE) | 0.4499 | 0.3554 | 0.2852 | 0.2304 | 0.2283 | 0.4542 | 0.9854 |
0.1 | 0.4627 | 0.3686 | 0.2986 | 0.2436 | 0.2360 | 0.4701 | 1.0701 | |
0.2 | 0.4664 | 0.3712 | 0.3004 | 0.2447 | 0.2371 | 0.4718 | 1.0741 | |
0.3 | 0.4582 | 0.3651 | 0.2956 | 0.2410 | 0.2342 | 0.4681 | 1.0553 | |
0.4 | 0.4638 | 0.3695 | 0.2988 | 0.2434 | 0.2361 | 0.4710 | 1.0584 | |
0.5 | 0.4624 | 0.3687 | 0.2987 | 0.2437 | 0.2359 | 0.4700 | 1.0584 | |
0.6 (CLpCE) | 0.4730 | 0.3775 | 0.3059 | 0.2496 | 0.2402 | 0.4715 | 1.0676 | |
0.7 | 0.4654 | 0.3713 | 0.3008 | 0.2449 | 0.2376 | 0.4712 | 1.0555 | |
0.8 | 0.4702 | 0.3745 | 0.3032 | 0.2470 | 0.2389 | 0.4743 | 1.0807 | |
0.9 | 0.4613 | 0.3672 | 0.2969 | 0.2421 | 0.2352 | 0.4689 | 1.0557 | |
1.0 (CL) | 0.0206 | 0.0016 | 0.0000 | 0.0000 | 0.0225 | 0.0266 | 0.0002 |
Metric | DCS | CS | GS | NS | TkS
---|---|---|---|---|---
maxTRR | 0.12 ± 0.09 | 0.22 ± 0.13 | 0.24 ± 0.15 | 0.27 ± 0.13 | 0.29 ± 0.16 |
Item | Desensitized Data Description
---|---
case A input | 14 108 30 13 20 18 23 21 10 14 32 16 39 27 47 51 31 29 20 18 10 24 42 26 37 61 24 10 40 13 45 163 45 39 159 49 50 204 37 21 157 155 10 |
CS output | 150 50 107 104 113 110 15 13 31 29 20 (maxTRR, 1/11) |
DCS output | (1) 150 50 107 66 17 81 76 33 81 10 (maxTRR, 1/10) |
(2) 150 50 107 80 33 17 13 31 81 60 49 29 (maxTRR, 1/12) | |
(3) 150 50 107 80 33 17 81 76 33 31 81 60 49 29 (maxTRR, 1/14) | |
(4) 150 50 65 107 29 113 15 29 20 60 49 29 (maxTRR, 3/12) | |
case B input | 83 12 38 41 17 1074 96 17 552 48 17 27 131 17 89 65 69 70 11 149 58 51 36 82 11 34 38 41 17 40 153 44 23 21 25 11 263 256 567 28 59 11 199 54 894 141 126 231 11 45 83 207 281 240 353 300 212 491 302 237 297 300 212 11 113 110 104 259 207 281 315 286 258 280 11 22 12 96 16 35 12 38 41 17 178 58 36 82 10 22 279 33 91 72 78 11 33 24 122 61 24 10 22 12 62 33 628 51 171 82 11 33 686 170 1119 11 22 12 119 17 143 175 105 744 26 37 72 78 11 22 12 38 41 17 210 143 170 179 10 |
CS output | 190 57 190 190 190 79 10 (maxTRR, 4/7) |
DCS output | (1) 49 75 100 344 282 11 57 49 77 75 100 57 92 10 (maxTRR, 2/14) |
(2) 49 75 100 344 282 49 57 49 77 75 100 57 92 10 (maxTRR, 3/14) | |
(3) 49 369 142 49 180 372 11 369 372 11 180 372 11 440 439 139 420 11 117 175 13 29 440 439 11 202 191 200 487 365 175 98 10 (maxTRR, 2/33) | |
(4) 49 369 142 49 180 372 11 369 372 11 180 372 11 440 439 139 420 11 117 487 384 440 439 11 202 191 175 98 278 10 (maxTRR, 2/30) |
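The (maxTRR, x/y) annotations above follow directly from the count of the most repeated token divided by the output length; for example, in the case B CS output the token 190 occurs 4 times among 7 tokens, giving 4/7. A small helper that reproduces these values (function name assumed):

```python
from collections import Counter

def max_token_repetition_ratio(tokens):
    """maxTRR: occurrences of the most frequent token divided by output length."""
    if not tokens:
        return 0.0
    return max(Counter(tokens).values()) / len(tokens)

# Case B, CS output from the table above: token 190 repeats 4 times in 7 tokens.
print(max_token_repetition_ratio([190, 57, 190, 190, 190, 79, 10]))  # 0.5714... = 4/7
```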
Diversity Control | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 | METEOR | ROUGE | CIDER
---|---|---|---|---|---|---|---
0.00 | 0.4939 | 0.4112 | 0.3470 | 0.2943 | 0.2616 | 0.5198 | 1.4477 |
0.01 | 0.4939 | 0.4111 | 0.3470 | 0.2942 | 0.2612 | 0.5190 | 1.4459 |
0.05 | 0.4939 | 0.4113 | 0.3466 | 0.2940 | 0.2613 | 0.5188 | 1.4445 |
0.10 | 0.4937 | 0.4107 | 0.3461 | 0.2933 | 0.2612 | 0.5182 | 1.4339 |
Team | Main Procedure in Diagnostic Report Generation | Score |
---|---|---|
A | CPT-base + noise-aware similarity bucketing + fine-tuning | 2.327 |
B | BART-large + GBPQ + fine-tuning | 2.297 |
C | (CPT-base + BART-base) + RAG + fine-tuning | 2.285 |
D | BART-large + fine-tuning | 2.272 |
E | BART-large + fine-tuning | 2.263 |
F | BART-large + fine-tuning | 2.249 |
ours | GPT2-Chinese + fine-tuning + CLpCEwDCS decoding | 2.135 |
Share and Cite
Zhang, T.; Meng, J.; Yang, Y.; Yu, S. Contrastive Learning Penalized Cross-Entropy with Diversity Contrastive Search Decoding for Diagnostic Report Generation of Reduced Token Repetition. Appl. Sci. 2024, 14, 2817. https://doi.org/10.3390/app14072817