
Natural Language Processing: Novel Methods and Applications

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 December 2024 | Viewed by 11278

Special Issue Editors


Dr. Giovanni Siragusa
Guest Editor
Department of Computer Science, Università di Torino, 10124 Turin, Italy
Interests: text summarization; chatbot; word embedding; natural language processing

Dr. Emilio Sulis
Guest Editor
Department of Computer Science, Università di Torino, 10124 Turin, Italy
Interests: text mining; information extraction; knowledge management; process mining; network analysis

Special Issue Information

Dear Colleagues,

We are inviting submissions to this Special Issue, entitled “Natural Language Processing: Novel Methods and Applications”.

Natural language processing is becoming ubiquitous in our lives. It is used in many fields and contexts, ranging from simple voice assistants for home and car automation to complex systems that retrieve similar judicial cases for the courts.

In the last decade, the availability of a large volume and variety of textual documents has attracted the interest of several scientific and humanistic areas. Therefore, the main objective of the Special Issue is to foster a shared view on this topic, integrating ideas and real-case applications from different communities.

In this Special Issue, we invite submissions exploring novel research frontiers and recent advances in this field, demonstrating how the interaction of different communities (such as psychology and law) and research fields (such as logic, human–computer interaction, and deep learning) can both benefit from natural language processing and benefit the field itself.

Dr. Giovanni Siragusa
Dr. Emilio Sulis
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine translation
  • irony and sarcasm detection
  • conversational agents
  • sentiment analysis
  • parsing and grammar formalism
  • speech recognition
  • text summarization
  • lexical semantics
  • linguistic resources
  • contrastive learning
  • social media mining
  • recommendation systems
  • information retrieval and semantic search
  • human–computer interaction

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (6 papers)


Research

16 pages, 635 KiB  
Article
TAWC: Text Augmentation with Word Contributions for Imbalance Aspect-Based Sentiment Classification
by Noviyanti Santoso, Israel Mendonça and Masayoshi Aritsugi
Appl. Sci. 2024, 14(19), 8738; https://doi.org/10.3390/app14198738 - 27 Sep 2024
Viewed by 595
Abstract
Text augmentation plays an important role in enhancing the generalizability of language models. However, traditional methods often overlook both the unique roles that individual words play in conveying meaning and the imbalance of the class distribution, thereby risking suboptimal performance and compromising the model’s generalizability. This limitation motivated us to develop a novel technique called Text Augmentation with Word Contributions (TAWC). Our approach tackles this problem in two core steps: first, it employs analytical correlation and semantic similarity metrics to discern the relationships between words and their associated aspect polarities; second, it tailors distinct augmentation strategies to individual words based on their identified functional contributions in the text. Extensive experiments on two aspect-based sentiment analysis datasets demonstrate that the proposed TAWC model significantly improves the classification performance of popular language models, achieving gains of up to 4% compared with data without augmentation, thereby setting a new standard in the field of text augmentation.
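To make the two steps concrete, here is a minimal, hypothetical Python sketch of contribution-aware augmentation in the spirit of TAWC; the PMI-style scoring, the threshold, and the helper names are illustrative assumptions, not the authors' exact method.

```python
# Hypothetical sketch of contribution-aware augmentation in the spirit of
# TAWC; the PMI-style score and function names are illustrative, not the
# paper's exact formulation.
import random
from collections import Counter, defaultdict

def word_contributions(corpus, labels):
    """Step 1: score how strongly each word is associated with each label,
    via a simple co-occurrence ratio (a PMI-like statistic)."""
    word_label = defaultdict(Counter)
    word_freq = Counter()
    label_freq = Counter(labels)
    n = len(corpus)
    for text, label in zip(corpus, labels):
        for w in set(text.split()):
            word_label[w][label] += 1
            word_freq[w] += 1
    return {
        w: {lab: (per_label[lab] / n) / ((word_freq[w] / n) * (label_freq[lab] / n))
            for lab in per_label}  # ratio > 1 means positive association
        for w, per_label in word_label.items()
    }

def augment(text, label, scores, synonyms, threshold=1.5):
    """Step 2: replace only words that contribute little to the label, so
    the label-bearing (aspect-polarity) words stay intact."""
    out = []
    for w in text.split():
        if scores.get(w, {}).get(label, 0.0) < threshold and w in synonyms:
            out.append(random.choice(synonyms[w]))
        else:
            out.append(w)
    return " ".join(out)
```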

26 pages, 6325 KiB  
Article
Improving the Accuracy and Effectiveness of Text Classification Based on the Integration of the Bert Model and a Recurrent Neural Network (RNN_Bert_Based)
by Chanthol Eang and Seungjae Lee
Appl. Sci. 2024, 14(18), 8388; https://doi.org/10.3390/app14188388 - 18 Sep 2024
Viewed by 1971
Abstract
This paper proposes a new robust model that improves text classification accuracy on the Stanford Sentiment Treebank v2 (SST-2) dataset. We developed a Recurrent Neural Network Bert-based (RNN_Bert_based) model designed to improve classification accuracy on the SST-2 dataset, which consists of movie review sentences, each labeled with either positive or negative sentiment, making it a binary classification task. Recurrent Neural Networks (RNNs) are effective for text classification because they capture the sequential nature of language, which is crucial for understanding context and meaning. Bert excels in text classification by providing bidirectional context, generating contextual embeddings, and leveraging pre-training on large corpora, which allows it to capture nuanced meanings and relationships within the text. Combining Bert with RNNs can therefore be highly effective: Bert’s bidirectional context and rich embeddings provide a deep understanding of the text, while RNNs capture sequential patterns and long-range dependencies, so the combination leverages the strengths of both architectures on complex classification tasks. We also developed an integration of the Bert model and a K-Nearest Neighbor-based (KNN_Bert_based) method as a comparative scheme for our proposed work. Based on the experimental results, our proposed model outperforms traditional text classification models as well as existing models in terms of accuracy.
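The general pattern — contextual token embeddings from BERT feeding a recurrent layer whose final states drive the classifier — can be sketched as follows; the choice of an LSTM and the layer sizes are assumptions for illustration, not the authors' exact RNN_Bert_based configuration.

```python
# Minimal sketch of the BERT -> RNN -> classifier pattern; the use of an
# LSTM and the layer sizes are assumptions, not the authors' exact setup.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertRNNClassifier(nn.Module):
    def __init__(self, model_name="bert-base-uncased", hidden=256, num_classes=2):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        self.rnn = nn.LSTM(self.bert.config.hidden_size, hidden,
                           batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, input_ids, attention_mask):
        # BERT provides bidirectional contextual embeddings per token ...
        tokens = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        # ... and the LSTM models their sequential structure.
        _, (h, _) = self.rnn(tokens)
        sentence = torch.cat([h[-2], h[-1]], dim=-1)  # both directions
        return self.head(sentence)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(["a gripping, well-acted film"], return_tensors="pt")
logits = BertRNNClassifier()(batch["input_ids"], batch["attention_mask"])
```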

18 pages, 1519 KiB  
Article
An Investigation of Applying Large Language Models to Spoken Language Learning
by Yingming Gao, Baorian Nuchged, Ya Li and Linkai Peng
Appl. Sci. 2024, 14(1), 224; https://doi.org/10.3390/app14010224 - 26 Dec 2023
Cited by 2 | Viewed by 3663
Abstract
People have long desired intelligent conversational systems that can provide assistance in practical scenarios. The latest advancements in large language models (LLMs) are making significant strides toward turning this aspiration into a tangible reality. LLMs are believed to hold the most potential and value in education, especially in the creation of AI-driven virtual teachers that facilitate language learning. This study focuses on assessing the effectiveness of LLMs within the educational domain, specifically in the area of spoken language learning, which encompasses phonetics, phonology, and second language acquisition. To this end, we first introduced a new multiple-choice question dataset to evaluate the effectiveness of LLMs in the aforementioned scenarios, covering both the understanding and the application of spoken language knowledge. Moreover, we investigated the influence of various prompting techniques, such as zero- and few-shot methods (prepending the question with question-answer exemplars), chain-of-thought (CoT) prompting, in-domain exemplars, and external tools. We conducted a comprehensive evaluation of popular LLMs (20 distinct models) using these methods. The experimental results showed that extracting conceptual knowledge posed few challenges for these LLMs, whereas application questions proved relatively difficult. In addition, prompting methods that are widely proven to be effective, combined with domain-specific examples, resulted in significant performance improvements over the zero-shot baselines. Other preliminary experiments also demonstrated the strengths and weaknesses of different LLMs. The findings of this study can shed light on the application of LLMs to spoken language learning.
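The prompting conditions the study compares can be illustrated with a simple prompt builder; the exemplar texts, option labels, and function name below are invented placeholders, not items from the paper's dataset.

```python
# Illustrative assembly of zero-shot, few-shot, and chain-of-thought
# prompts; exemplars and wording are invented, not the paper's materials.
def build_prompt(question, options, exemplars=(), chain_of_thought=False):
    parts = []
    for ex_q, ex_a in exemplars:  # few-shot: prepend question-answer exemplars
        parts.append(f"Question: {ex_q}\nAnswer: {ex_a}")
    instruction = ("Think step by step, then give the final answer."
                   if chain_of_thought else "Give the final answer.")
    choices = "\n".join(f"{letter}. {text}" for letter, text in options)
    parts.append(f"Question: {question}\n{choices}\n{instruction}\nAnswer:")
    return "\n\n".join(parts)

prompt = build_prompt(
    "Which vowel is produced with the highest tongue position?",
    [("A", "/i/"), ("B", "/a/"), ("C", "/u/")],
    exemplars=[("How many phonemes does 'cat' contain?", "Three.")],
    chain_of_thought=True,  # CoT condition; omit for the zero-shot baseline
)
```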

13 pages, 829 KiB  
Article
Boosting Lightweight Sentence Embeddings with Knowledge Transfer from Advanced Models: A Model-Agnostic Approach
by Kadir Gunel and Mehmet Fatih Amasyali
Appl. Sci. 2023, 13(23), 12586; https://doi.org/10.3390/app132312586 - 22 Nov 2023
Viewed by 1163
Abstract
In this study, we investigate knowledge transfer between two distinct sentence embedding models: a computationally demanding, highly performant model and a lightweight model derived from word vector averaging. Our objective is to augment the representational power of the lightweight model by exploiting the sophisticated features of the robust model. Diverging from traditional knowledge distillation methods that align the logits or hidden states of teacher and student models, our approach uses only the output sentence vectors of the teacher model for alignment with the student model’s word vector representations. We implement two minimization techniques for this purpose: distance minimization, and combined distance and perplexity minimization. Our methodology uses WMT datasets for training, and the enhanced embeddings are validated via Google’s Analogy tasks and Meta’s SentEval datasets. We found that our proposed models intriguingly retained and conveyed information in a model-specific fashion.
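A minimal sketch of the distance-minimization variant, assuming an averaging student with a linear projection into the teacher's embedding space; the projection layer and the MSE objective are illustrative assumptions, not the authors' exact training setup.

```python
# Sketch of distance minimization: pull the student's averaged word vectors
# toward the teacher's output sentence vectors. The linear projection and
# MSE loss are assumptions for illustration.
import torch
import torch.nn as nn

class AveragingStudent(nn.Module):
    def __init__(self, vocab_size=30000, dim=300, teacher_dim=768):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim, mode="mean")  # word averaging
        self.project = nn.Linear(dim, teacher_dim)  # map into the teacher's space

    def forward(self, token_ids, offsets):
        return self.project(self.embed(token_ids, offsets))

student = AveragingStudent()
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

def distillation_step(token_ids, offsets, teacher_sentence_vecs):
    # Only the teacher's output sentence vectors are used for alignment,
    # mirroring the paper's departure from logit/hidden-state distillation.
    loss = nn.functional.mse_loss(student(token_ids, offsets),
                                  teacher_sentence_vecs)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```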

25 pages, 1513 KiB  
Article
Hierarchical Clause Annotation: Building a Clause-Level Corpus for Semantic Parsing with Complex Sentences
by Yunlong Fan, Bin Li, Yikemaiti Sataer, Miao Gao, Chuanqi Shi, Siyi Cao and Zhiqiang Gao
Appl. Sci. 2023, 13(16), 9412; https://doi.org/10.3390/app13169412 - 19 Aug 2023
Cited by 3 | Viewed by 1651
Abstract
Most natural language processing (NLP) tasks, such as semantic parsing, syntactic parsing, machine translation, and text summarization, suffer performance degradation when encountering long, complex sentences. Previous works addressed the issue with the intuition of decomposing complex sentences and linking simple ones, e.g., rhetorical structure theory (RST)-style discourse parsing, split-and-rephrase (SPRP), text simplification (TS), and simple sentence decomposition (SSD). However, these works are not applicable to semantic parsing tasks such as abstract meaning representation (AMR) parsing and semantic dependency parsing, owing to their misalignment with semantic relations and their inability to preserve the original semantics. Following the same intuition while avoiding the deficiencies of previous works, we propose a novel framework, hierarchical clause annotation (HCA), for capturing the clausal structures of complex sentences, based on linguistic research on clause hierarchy. With the HCA framework, we annotated a large HCA corpus to explore the potential of integrating HCA structural features into semantic parsing of complex sentences. Moreover, we decomposed HCA into two subtasks, i.e., clause segmentation and clause parsing, and provide neural baseline models for producing further silver annotations. Evaluated on our manually annotated HCA dataset, the proposed models achieved a 91.3% F1-score for clause segmentation and an 88.5% Parseval score for clause parsing. Since the same model architectures were employed, the performance differences between the clause/discourse segmentation and parsing subtasks reflect the differences between our HCA corpus and the compared discourse corpora, where our sentences contain more segment units and fewer interrelations.
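The two subtasks can be pictured with a toy clause tree: segmentation recovers the clause spans, and parsing attaches typed interrelations between them. The field names and relation labels below are hypothetical, not the corpus's actual annotation schema.

```python
# Toy data structure for HCA-style annotation; field names and relation
# labels are hypothetical, not the corpus's actual schema.
from dataclasses import dataclass, field

@dataclass
class Clause:
    text: str                                    # the clause's token span
    relation: str = "root"                       # interrelation to the parent
    children: list = field(default_factory=list)

sentence = Clause(
    "the parser degrades",
    children=[
        Clause("when sentences grow long", relation="temporal"),
        Clause("because dependencies cross clause boundaries", relation="causal"),
    ],
)

def segments(clause):
    """Clause segmentation: flatten the hierarchy back into clause spans."""
    yield clause.text
    for child in clause.children:
        yield from segments(child)

print(list(segments(sentence)))  # three clause segments
```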

18 pages, 2814 KiB  
Article
A Multitask Cross-Lingual Summary Method Based on ABO Mechanism
by Qing Li, Weibing Wan and Yuming Zhao
Appl. Sci. 2023, 13(11), 6723; https://doi.org/10.3390/app13116723 - 31 May 2023
Cited by 1 | Viewed by 1183
Abstract
Recent cross-lingual summarization research has pursued the use of a unified end-to-end model, which has demonstrated a certain level of improvement in performance and effectiveness, but this approach stitches together multiple tasks and makes the computation more complex. Less work has focused on alignment relationships across languages, which has led to persistent problems of summary misordering and loss of key information. For this reason, we first simplify the multitasking by converting the translation task into an equal proportion of cross-lingual summary tasks, so that the model performs only cross-lingual summary tasks when generating cross-lingual summaries. In addition, we splice monolingual and cross-lingual summary sequences as an input so that the model can fully learn the core content of the corpus. We then propose a reinforced regularization method based on the model to improve its robustness, and build a targeted ABO mechanism to enhance the semantic relationship alignment and key-information retention of the cross-lingual summaries. Ablation experiments on three datasets of different orders of magnitude demonstrate that the optimization approach effectively enhances the model, which outperforms mainstream approaches on both the cross-lingual and monolingual summarization tasks for the full dataset. Finally, we validate the model’s capabilities on a cross-lingual summary dataset from a professional domain, and the results demonstrate its superior performance and its ability to improve cross-lingual ordering.
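One plausible reading of the splicing step — hedged, since the abstract does not show the exact sequence format — is that the monolingual and cross-lingual summary sequences are joined into a single training sequence; the separator token below is invented for the sketch.

```python
# Hypothetical sketch of splicing monolingual and cross-lingual summary
# sequences into one training sequence; the separator token is invented.
def splice_sequences(mono_summary, cross_summary, sep="<sep>"):
    """Join both summaries so the model sees the core content in the source
    language alongside its target-language rendering."""
    return f"{mono_summary} {sep} {cross_summary}"

spliced = splice_sequences(
    "The budget was approved after a long debate.",
    "预算在长时间辩论后获得批准。",  # the same summary in the target language
)
```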
