Mathematics

Editorial

5 pages, 185 KiB

Open Access Editorial

Preface to the Special Issue “Natural Language Processing (NLP) and Machine Learning (ML)—Theory and Applications”

by Florentina Hristea and Cornelia Caragea

Mathematics 2022, 10(14), 2481; https://doi.org/10.3390/math10142481 - 16 Jul 2022

Viewed by 1468

Abstract

Natural language processing (NLP) is one of the most important technologies in use today, especially due to the large and growing amount of online text, which needs to be understood in order to fully ascertain its enormous value [...] Full article

(This article belongs to the Special Issue Natural Language Processing (NLP) and Machine Learning (ML)—Theory and Applications)

Research

25 pages, 3539 KiB

Open Access Article

Analytics Methods to Understand Information Retrieval Effectiveness—A Survey

by Josiane Mothe

Mathematics 2022, 10(12), 2135; https://doi.org/10.3390/math10122135 - 19 Jun 2022

Cited by 7 | Viewed by 3195 | Correction

Abstract

Information retrieval aims to retrieve the documents that answer users’ queries. A typical search process consists of different phases for which a variety of components have been defined in the literature, each with a set of hyper-parameters to tune. Some studies have focused on how, and how much, the components and their hyper-parameters affect system performance in terms of effectiveness; others have focused on the query factor. The aim of these studies is to better understand information retrieval system effectiveness. This paper reviews the literature of this domain. It depicts how data analytics has been used in IR to gain a better understanding of system effectiveness. This review concludes that we lack a full understanding of system effectiveness related to the context which the system is in, though it has been possible to adapt the query processing to some contexts successfully. This review also concludes that, even if it is possible to distinguish effective from non-effective systems for a query set, neither the system component analysis nor the query features analysis was successful in explaining when and why a particular system fails on a particular query. Full article

21 pages, 443 KiB

Open Access Article

On the Use of Morpho-Syntactic Description Tags in Neural Machine Translation with Small and Large Training Corpora

by Gregor Donaj and Mirjam Sepesy Maučec

Mathematics 2022, 10(9), 1608; https://doi.org/10.3390/math10091608 - 9 May 2022

Cited by 4 | Viewed by 2152

Abstract

With the transition to neural architectures, machine translation achieves very good quality for several resource-rich languages. However, the results are still much worse for languages with complex morphology, especially if they are low-resource languages. This paper reports the results of a systematic analysis of adding morphological information into neural machine translation system training. Translation systems presented and compared in this research exploit morphological information from corpora in different formats. Some formats join semantic and grammatical information and others separate these two types of information. Semantic information is modeled using lemmas and grammatical information using Morpho-Syntactic Description (MSD) tags. Experiments were performed on corpora of different sizes for the English–Slovene language pair. The conclusions were drawn for a domain-specific translation system and for a translation system for the general domain. With MSD tags, we improved the performance by up to 1.40 and 1.68 BLEU points in the two translation directions. We found that systems with training corpora in different formats improve the performance differently depending on the translation direction and corpus size. Full article

14 pages, 490 KiB

Open Access Article

Identifying Source-Language Dialects in Translation

by Sergiu Nisioi, Ana Sabina Uban and Liviu P. Dinu

Mathematics 2022, 10(9), 1431; https://doi.org/10.3390/math10091431 - 24 Apr 2022

Cited by 2 | Viewed by 2529

Abstract

In this paper, we aim to explore the degree to which translated texts preserve linguistic features of dialectal varieties. We release a dataset of augmented annotations to the Proceedings of the European Parliament that cover dialectal speaker information, and we analyze different classes of written English covering native varieties from the British Isles. Our analyses aim to discuss the discriminatory features between the different classes and to reveal words whose usage differs between varieties of the same language. We perform classification experiments and show that automatically distinguishing between the dialectal varieties is possible with high accuracy, even after translation, and propose a new explainability method based on embedding alignments in order to reveal specific differences between dialects at the level of the vocabulary. Full article

26 pages, 2077 KiB

Open Access Article

Taylor-ChOA: Taylor-Chimp Optimized Random Multimodal Deep Learning-Based Sentiment Classification Model for Course Recommendation

by Santosh Kumar Banbhrani, Bo Xu, Hongfei Lin and Dileep Kumar Sajnani

Mathematics 2022, 10(9), 1354; https://doi.org/10.3390/math10091354 - 19 Apr 2022

Cited by 7 | Viewed by 2106

Abstract

Course recommendation is key to achievement along a student’s academic path. However, it is challenging to appropriately select course content among numerous online education resources, due to the differences in users’ knowledge structures. Therefore, this paper develops a novel sentiment classification approach for recommending courses using Taylor-chimp Optimization Algorithm enabled Random Multimodal Deep Learning (Taylor ChOA-based RMDL). Here, the proposed Taylor ChOA is newly devised by combining the Taylor concept with the Chimp Optimization Algorithm (ChOA). Initially, course review is done to find the optimal course, and thereafter feature extraction is performed to extract the significant features needed for further processing. Finally, sentiment classification is done using RMDL, which is trained by the proposed optimization algorithm, Taylor ChOA. Thus, the positively reviewed courses are obtained from the classified sentiments for improving the course recommendation procedure. Extensive experiments are conducted using the E-Khool dataset and Coursera course dataset. Empirical results demonstrate that the Taylor ChOA-based RMDL model significantly outperforms state-of-the-art methods for course recommendation tasks. Full article

23 pages, 2348 KiB

Open Access Article

Automatic Classification of National Health Service Feedback

by Christopher Haynes, Marco A. Palomino, Liz Stuart, David Viira, Frances Hannon, Gemma Crossingham and Kate Tantam

Mathematics 2022, 10(6), 983; https://doi.org/10.3390/math10060983 - 18 Mar 2022

Cited by 9 | Viewed by 2801

Abstract

Text datasets come in an abundance of shapes, sizes and styles. However, determining what factors limit classification accuracy remains a difficult task which is still the subject of intensive research. Using a challenging UK National Health Service (NHS) dataset, which contains many characteristics known to increase the complexity of classification, we propose an innovative classification pipeline. This pipeline switches between different text pre-processing, scoring and classification techniques during execution. Using this flexible pipeline, a high level of accuracy has been achieved in the classification of a range of datasets, attaining a micro-averaged F1 score of 93.30% on the Reuters-21578 “ApteMod” corpus. An evaluation of this flexible pipeline was carried out using a variety of complex datasets compared against an unsupervised clustering approach. The paper describes how classification accuracy is impacted by an unbalanced category distribution, the rare use of generic terms and the subjective nature of manual human classification. Full article

24 pages, 8510 KiB

Open Access Article

Towards a Benchmarking System for Comparing Automatic Hate Speech Detection with an Intelligent Baseline Proposal

by Ștefan Dascălu and Florentina Hristea

Mathematics 2022, 10(6), 945; https://doi.org/10.3390/math10060945 - 16 Mar 2022

Cited by 6 | Viewed by 2810

Abstract

Hate Speech is a frequent problem occurring among Internet users. Recent regulations are being discussed by U.K. representatives (“Online Safety Bill”) and by the European Commission, which plans on introducing Hate Speech as an “EU crime”. Recently passed legislation to combat this kind of speech places the burden of identification on the hosting websites, often within a tight time frame (24 h in France and Germany). These constraints make automatic Hate Speech detection a very important topic for major social media platforms. However, recent literature on Hate Speech detection lacks a benchmarking system that can evaluate how different approaches compare against each other regarding the predictions made on different types of text (short snippets such as those present on Twitter, as well as lengthier fragments). This paper intended to deal with this issue and to take a step forward towards the standardization of testing for this type of natural language processing (NLP) application. Furthermore, this paper explored different transformer and LSTM-based models in order to evaluate the performance of multi-task and transfer learning models used for Hate Speech detection. Some of the results obtained in this paper surpassed the existing ones. The paper concluded that transformer-based models have the best performance on all studied datasets. Full article

14 pages, 268 KiB

Open Access Editor’s Choice Article

Intermediate-Task Transfer Learning with BERT for Sarcasm Detection

by Edoardo Savini and Cornelia Caragea

Mathematics 2022, 10(5), 844; https://doi.org/10.3390/math10050844 - 7 Mar 2022

Cited by 45 | Viewed by 6023

Abstract

Sarcasm detection plays an important role in natural language processing as it can impact the performance of many applications, including sentiment analysis, opinion mining, and stance detection. Despite substantial progress on sarcasm detection, the research results are scattered across datasets and studies. In this paper, we survey the current state-of-the-art and present strong baselines for sarcasm detection based on BERT pre-trained language models. We further improve our BERT models by fine-tuning them on related intermediate tasks before fine-tuning them on our target task. Specifically, relying on the correlation between sarcasm and (implied negative) sentiment and emotions, we explore a transfer learning framework that uses sentiment classification and emotion detection as individual intermediate tasks to infuse knowledge into the target task of sarcasm detection. Experimental results on three datasets that have different characteristics show that the BERT-based models outperform many previous models. Full article

27 pages, 1761 KiB

Open Access Article

Parallel Stylometric Document Embeddings with Deep Learning Based Language Models in Literary Authorship Attribution

by Mihailo Škorić, Ranka Stanković, Milica Ikonić Nešić, Joanna Byszuk and Maciej Eder

Mathematics 2022, 10(5), 838; https://doi.org/10.3390/math10050838 - 7 Mar 2022

Cited by 7 | Viewed by 4482

Abstract

This paper explores the effectiveness of parallel stylometric document embeddings in solving the authorship attribution task by testing a novel approach on literary texts in 7 different languages, totaling 7051 unique 10,000-token chunks from 700 PoS and lemma annotated documents. We used these documents to produce four document embedding models using the Stylo R package (word-based, lemma-based, PoS-trigrams-based, and PoS-mask-based) and one document embedding model using mBERT for each of the seven languages. We created further derivations of these embeddings in the form of average, product, minimum, maximum, and $l^{2}$ norm of these document embedding matrices and tested them both including and excluding the mBERT-based document embeddings for each language. Finally, we trained several perceptrons on the portions of the dataset in order to procure adequate weights for a weighted combination approach. We tested standalone (two baselines) and composite embeddings for classification accuracy, precision, recall, and weighted-average and macro-averaged $F_{1}$-score, compared them with one another, and found that for each language most of our composition methods outperform the baselines (with a couple of methods outperforming all baselines for all languages), with or without mBERT inputs, which are found to have no significant positive impact on the results of our methods. Full article
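
The five derivations named in the abstract above (average, product, minimum, maximum, and $l^{2}$ norm) can be illustrated with a minimal Python sketch. It assumes the combination is applied element by element across equally shaped matrices, which the abstract does not state; the function name `combine` and the toy values are hypothetical and are not taken from the paper.

```python
import math


def combine(matrices, method):
    """Combine equally shaped matrices element by element (assumed scheme)."""
    reducers = {
        "average": lambda vals: sum(vals) / len(vals),
        "product": math.prod,
        "minimum": min,
        "maximum": max,
        # l2 norm of the values found at the same position in each matrix
        "l2": lambda vals: math.sqrt(sum(v * v for v in vals)),
    }
    reduce_cell = reducers[method]
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [reduce_cell([m[i][j] for m in matrices]) for j in range(cols)]
        for i in range(rows)
    ]


# Two toy 2x2 document matrices (hypothetical values).
word_based = [[0.0, 3.0], [3.0, 0.0]]
lemma_based = [[0.0, 4.0], [4.0, 0.0]]

composite = combine([word_based, lemma_based], "l2")
```

With these toy inputs, the off-diagonal entries of `composite` are the $l^{2}$ norm of 3.0 and 4.0. The weighted combination learned by perceptrons, also mentioned in the abstract, is not covered by this sketch.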

24 pages, 1634 KiB

Open Access Article

Unsupervised and Supervised Methods to Estimate Temporal-Aware Contradictions in Online Course Reviews

by Ismail Badache, Adrian-Gabriel Chifu and Sébastien Fournier

Mathematics 2022, 10(5), 809; https://doi.org/10.3390/math10050809 - 3 Mar 2022

Cited by 2 | Viewed by 2743

Abstract

The analysis of user-generated content on the Internet has become increasingly popular for a wide variety of applications. One particular type of content is represented by the user reviews for programs, multimedia, products, and so on. Investigating the opinion contained in reviews may help in following the evolution of the reviewed items and thus in improving their quality. Detecting contradictory opinions in reviews is crucial when evaluating the quality of the respective resource. This article aims to estimate the contradiction intensity (strength) in the context of online courses (MOOCs). This estimation was based on review ratings and on sentiment polarity in the comments, with respect to specific aspects, such as “lecturer”, “presentation”, etc. Between course sessions, users stop reviewing, and also, the course contents may evolve. Thus, the reviews are time dependent, which is why they should be grouped by course session. Having this in mind, the contribution of this paper is threefold: (a) defining the notion of subjective contradiction around specific aspects and then estimating its intensity based on sentiment polarity, review ratings, and temporality; (b) developing a dataset to evaluate the contradiction intensity measure, which was annotated based on a user study; (c) comparing our unsupervised method with supervised methods with automatic feature selection, over the dataset. The dataset collected from coursera.org is in English. It includes 2244 courses and 73,873 user-generated reviews of those courses. The results proved that the standard deviation of the ratings, the standard deviation of the polarities, and the number of reviews are suitable features for predicting the contradiction intensity classes. Among the supervised methods, the J48 decision trees algorithm yielded the best performance, compared to the naive Bayes model and the SVM model. Full article
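
The three features reported as suitable in the abstract above (standard deviation of the ratings, standard deviation of the polarities, and number of reviews) can be computed as in the following sketch. It assumes the population standard deviation and a single group of reviews for one aspect and one course session; the abstract does not specify either, and the function name and sample values are hypothetical.

```python
from statistics import pstdev


def contradiction_features(ratings, polarities):
    """Compute the three features for one group of reviews.

    Assumption: population standard deviation (pstdev); the abstract
    does not say whether the sample or population form was used.
    """
    return {
        "rating_std": pstdev(ratings),
        "polarity_std": pstdev(polarities),
        "n_reviews": len(ratings),
    }


# Four hypothetical reviews of one aspect: star ratings and polarity scores.
features = contradiction_features([5, 1, 5, 1], [0.8, -0.8, 0.8, -0.8])
```

A group in which ratings and polarities are split between extremes, as in this toy input, yields high values for both deviations, which matches the intuition that disagreement signals contradiction.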

9 pages, 471 KiB

Open Access Article

Cross-Lingual Transfer Learning for Arabic Task-Oriented Dialogue Systems Using Multilingual Transformer Model mT5

by Ahlam Fuad and Maha Al-Yahya

Mathematics 2022, 10(5), 746; https://doi.org/10.3390/math10050746 - 26 Feb 2022

Cited by 10 | Viewed by 2913

Abstract

Due to the promising performance of pre-trained language models for task-oriented dialogue systems (DS) in English, some efforts to provide multilingual models for task-oriented DS in low-resource languages have emerged. These efforts still face a long-standing challenge due to the lack of high-quality data for these languages, especially Arabic. To circumvent costly and time-intensive data collection and annotation, cross-lingual transfer learning can be used when few training data are available in the low-resource target language. Therefore, this study aims to explore the effectiveness of cross-lingual transfer learning in building an end-to-end Arabic task-oriented DS using the mT5 transformer model. We use the Arabic task-oriented dialogue dataset (Arabic-TOD) in the training and testing of the model. We present the cross-lingual transfer learning deployed with three different approaches: mSeq2Seq, Cross-lingual Pre-training (CPT), and Mixed-Language Pre-training (MLT). We obtain good results for our model compared to the literature for the Chinese language using the same settings. Furthermore, cross-lingual transfer learning deployed with the MLT approach outperforms the other two approaches. Finally, we show that our results can be improved by increasing the training dataset size. Full article

11 pages, 405 KiB

Open Access Article

Improving Machine Reading Comprehension with Multi-Task Learning and Self-Training

by Jianquan Ouyang and Mengen Fu

Mathematics 2022, 10(3), 310; https://doi.org/10.3390/math10030310 - 19 Jan 2022

Cited by 3 | Viewed by 4884

Abstract

Machine Reading Comprehension (MRC) is an AI challenge that requires machines to determine the correct answer to a question based on a given passage. Extractive MRC requires extracting an answer span from the given passage, as in the task of span extraction. In contrast, non-extractive MRC infers answers from the content of reference passages, and includes Yes/No question answering and unanswerable questions. Due to the specificity of the two types of MRC tasks, researchers usually work on one type of task separately, but real-life applications often require models that can handle many different types of tasks in parallel. Therefore, to meet the comprehensive requirements of such applications, we construct a multi-task fusion training reading comprehension model based on the BERT pre-training model. The model uses the BERT pre-training model to obtain contextual representations, which are then shared by three downstream sub-modules for span extraction, Yes/No question answering, and unanswerable questions. Next, we fuse the outputs of the three sub-modules into a new span extraction output and use the fused cross-entropy loss function for global training. In the training phase, since our model requires a large amount of labeled training data, which is often expensive to obtain or unavailable in many tasks, we additionally use self-training to generate pseudo-labeled training data to train our model and improve its accuracy and generalization performance. We evaluated our model on the SQuAD2.0 and CAIL2019 datasets. The experiments show that our model can efficiently handle different tasks. We achieved 83.2 EM and 86.7 F1 scores on the SQuAD2.0 dataset and 73.0 EM and 85.3 F1 scores on the CAIL2019 dataset. Full article

11 pages, 363 KiB

Open Access Article

Evaluating Research Trends from Journal Paper Metadata, Considering the Research Publication Latency

by Christian-Daniel Curiac, Ovidiu Banias and Mihai Micea

Mathematics 2022, 10(2), 233; https://doi.org/10.3390/math10020233 - 13 Jan 2022

Cited by 5 | Viewed by 2100

Abstract

Investigating the research trends within a scientific domain by analyzing semantic information extracted from scientific journals has been a topic of interest in the natural language processing (NLP) field. A research trend evaluation is generally based on the time evolution of the term occurrence or the term topic, but it neglects an important aspect—research publication latency. The average time lag between the research and its publication may vary from one month to more than one year, and it is a characteristic that may have significant impact when assessing research trends, mainly for rapidly evolving scientific areas. To cope with this problem, the present paper is the first work that explicitly considers research publication latency as a parameter in the trend evaluation process. Consequently, we provide a new trend detection methodology that mixes auto-ARIMA prediction with Mann–Kendall trend evaluations. The experimental results in an electronic design automation case study prove the viability of our approach. Full article
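
The Mann–Kendall trend evaluation mentioned in the abstract above can be sketched in Python as follows. This is only the standard test statistic under the assumption of no tied values; it does not reproduce the paper's methodology, which also involves auto-ARIMA prediction and a publication latency parameter. The function name and the yearly counts are hypothetical.

```python
import math


def mann_kendall(series):
    """Return (S, Z) for the Mann-Kendall trend test, assuming no tied values."""
    n = len(series)
    s = 0
    # S sums the signs of all pairwise differences x_j - x_i with i < j.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S when there are no ties.
    var_s = n * (n - 1) * (2 * n + 5) / 18
    # Continuity-corrected normal approximation.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


# Hypothetical yearly occurrence counts of a term in paper metadata.
term_counts = [3, 5, 8, 13, 21, 34]
s, z = mann_kendall(term_counts)
```

A value of `z` above 1.96 indicates an increasing trend at the 5% level in a two-sided test, and a value below -1.96 a decreasing one.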

21 pages, 440 KiB

Open Access Article

Identifying the Structure of CSCL Conversations Using String Kernels

by Mihai Masala, Stefan Ruseti, Traian Rebedea, Mihai Dascalu, Gabriel Gutu-Robu and Stefan Trausan-Matu

Mathematics 2021, 9(24), 3330; https://doi.org/10.3390/math9243330 - 20 Dec 2021

Cited by 2 | Viewed by 2740

Abstract

Computer-Supported Collaborative Learning tools are increasingly popular in education, as they allow multiple participants to easily communicate, share knowledge, solve problems collaboratively, or seek advice. Nevertheless, multi-participant conversation logs are often hard for teachers to follow due to the mixture of multiple, often concurrent, discussion threads, with different interaction patterns between participants. Automated guidance can be provided with the help of Natural Language Processing techniques that target the identification of topic mixtures and of semantic links between utterances in order to adequately observe the debate and continuation of ideas. This paper introduces a method for discovering such semantic links embedded within chat conversations using string kernels, word embeddings, and neural networks. Our approach was validated on two datasets and obtained state-of-the-art results on both. Trained on a relatively small set of conversations, our models relying on string kernels are very effective for detecting such semantic links with a matching accuracy larger than 50% and represent a better alternative to complex deep neural networks, frequently employed in various Natural Language Processing tasks where large datasets are available. Full article

18 pages, 286 KiB

Open Access Article

Definition Extraction from Generic and Mathematical Domains with Deep Ensemble Learning

by Natalia Vanetik and Marina Litvak

Mathematics 2021, 9(19), 2502; https://doi.org/10.3390/math9192502 - 6 Oct 2021

Cited by 1 | Viewed by 2193

Abstract

Definitions are extremely important for efficient learning of new materials. In particular, mathematical definitions are necessary for understanding mathematics-related areas. Automated extraction of definitions could be very useful for automated indexing of educational materials, building taxonomies of relevant concepts, and more. For definitions that are contained within a single sentence, this problem can be viewed as a binary classification of sentences into definitions and non-definitions. In this paper, we focus on automatic detection of one-sentence definitions in mathematical and general texts. We experiment with different classification models arranged in an ensemble and applied to a sentence representation containing syntactic and semantic information, to classify sentences. Our ensemble model is applied to the data adjusted with oversampling. Our experiments demonstrate the superiority of our approach over state-of-the-art methods in both general and mathematical domains. Full article

11 pages, 444 KiB

Open Access Article

To Batch or Not to Batch? Comparing Batching and Curriculum Learning Strategies across Tasks and Datasets

by Laura Burdick, Jonathan K. Kummerfeld and Rada Mihalcea

Mathematics 2021, 9(18), 2234; https://doi.org/10.3390/math9182234 - 11 Sep 2021

Cited by 2 | Viewed by 2160

Abstract

Many natural language processing architectures are greatly affected by seemingly small design decisions, such as batching and curriculum learning (how the training data are ordered during training). In order to better understand the impact of these decisions, we present a systematic analysis of different curriculum learning strategies and different batching strategies. We consider multiple datasets for three tasks: text classification, sentence and phrase similarity, and part-of-speech tagging. Our experiments demonstrate that certain curriculum learning and batching decisions do increase performance substantially for some tasks. Full article

Other

7 pages, 195 KiB

Open Access Correction

Correction: Mothe, J. Analytics Methods to Understand Information Retrieval Effectiveness—A Survey. Mathematics 2022, 10, 2135

by Josiane Mothe

Mathematics 2022, 10(18), 3397; https://doi.org/10.3390/math10183397 - 19 Sep 2022

Viewed by 1082

Abstract

The author wishes to make the following corrections to this paper [1]: In Abstract, (1) “It depicts how data analytics has been used in IR for a better understanding system effectiveness” should be “It depicts how data analytics has been used in IR to gain a better understanding of system effectiveness”; (2) “This review concludes lack of full understanding of system effectiveness according to the context although it has been possible to adapt the query processing to some contexts successfully” should be changed to “This review concludes that we lack a full understanding of system effectiveness related to the context which the system is in, though it has been possible to adapt the query processing to some contexts successfully”; (3) “This review also concludes that, even if it is possible to distinguish effective from non effective system on average on a query set” should be changed to “This review also concludes that, even if it is possible to distinguish effective from non-effective systems for a query set” [...] Full article
