Article

A Study on Text Classification in the Age of Large Language Models

by Paul Trust *,† and Rosane Minghim †
School of Computer Science and Information Technology, University College Cork, T12 K8AF Cork, Ireland
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mach. Learn. Knowl. Extr. 2024, 6(4), 2688-2721; https://doi.org/10.3390/make6040129
Submission received: 10 August 2024 / Revised: 3 November 2024 / Accepted: 12 November 2024 / Published: 21 November 2024

Abstract

Large language models (LLMs) have recently made significant advances, excelling in tasks such as question answering, summarization, and machine translation. However, their enormous size and hardware requirements make them less accessible to many in the machine learning community. To address this, techniques such as quantization, prefix tuning, weak supervision, low-rank adaptation, and prompting have been developed to customize these models for specific applications. While these methods have mainly improved text generation, their implications for text classification have not been thoroughly studied. Our research bridges this gap by investigating how factors such as model size, pre-training objective, quantization, low-rank adaptation, prompting, and various hyperparameters influence text classification. Our overall conclusions are as follows: (1) even with synthetic labels, fine-tuning outperforms prompting techniques, and increasing model size does not always improve classification performance; (2) discriminatively pre-trained models generally perform better than generatively pre-trained ones; and (3) fine-tuning at 16-bit precision works much better than at 8-bit or 4-bit, but the performance drop from 8-bit to 4-bit is smaller than that from 16-bit to 8-bit. In a further set of experiments with different settings for low-rank adaptation (LoRA) and quantization, we found that increasing LoRA dropout negatively affects classification performance. We found no clear link between the LoRA attention dimension (rank) and performance, observing only small differences between standard LoRA and variants such as rank-stabilized LoRA and weight-decomposed LoRA. Additional observations to support model setup for classification tasks are presented in our analyses.
Keywords: text classification; language models; fine-tuning; prompting
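
The LoRA and quantization factors the abstract varies (precision, rank, dropout, rank-stabilized LoRA) can be made concrete with a short configuration sketch. The following is a minimal, hypothetical example using the Hugging Face transformers, peft, and bitsandbytes libraries; the paper does not specify its software stack, and the base model, rank, alpha, and dropout values shown are illustrative assumptions rather than the authors' actual settings.

# Hypothetical sketch: 4-bit quantized fine-tuning with LoRA for text
# classification. The library stack, base model, and all hyperparameter
# values are assumptions for illustration, not the paper's setup.
import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit loading; the study compares 16-, 8-, and 4-bit precision.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base",              # assumed backbone for illustration
    num_labels=4,
    quantization_config=bnb_config,
)
model = prepare_model_for_kbit_training(model)

# LoRA settings mirroring the factors varied in the study: rank (r),
# dropout, and the rank-stabilized variant (use_rslora); recent peft
# releases also expose the weight-decomposed variant via use_dora.
lora_config = LoraConfig(
    r=8,                         # LoRA attention dimension (rank)
    lora_alpha=16,
    lora_dropout=0.05,           # the study finds higher dropout hurts accuracy
    target_modules=["query", "value"],
    task_type="SEQ_CLS",
    use_rslora=False,            # set True for rank-stabilized LoRA
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()   # only the adapter weights are trainable

Under this kind of setup, only the low-rank adapter matrices are updated during fine-tuning, which is what makes it practical to compare many rank, dropout, and precision combinations as the study does.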
