Entropy in Real-World Datasets and Its Impact on Machine Learning II

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Multidisciplinary Applications".

Deadline for manuscript submissions: 31 March 2025 | Viewed by 5401

Special Issue Editors


Guest Editor
1. Department of Knowledge Engineering, University of Economics, 1 Maja 50, 40-287 Katowice, Poland
2. Łukasiewicz Research Network - Institute of Innovative Technologies EMAG, 40-189 Katowice, Poland
Interests: machine learning; ensemble methods; decision trees; ant colony optimization; computational intelligence; data analysis; optimization

Guest Editor
Łukasiewicz Research Network - Institute of Innovative Technologies EMAG, 40-189 Katowice, Poland
Interests: cyber security; artificial intelligence; data security; automation; electric power engineering

Guest Editor
Department of Machine Learning, University of Economics, 1 Maja 50, 40-287 Katowice, Poland
Interests: machine learning; natural language processing; social networks; artificial intelligence; fake news detection

Special Issue Information

Dear Colleagues,

Today, data science and machine learning remain pivotal for solving the most intricate real-world challenges. Their versatility and utility span various domains, including medicine, finance, text mining, image analysis, and more. Simultaneously, the abundance of user-accessible data continues to grow, with concepts like big data and data streams gaining ever-increasing recognition. However, traditional classification methods may be of questionable efficacy in handling such complex data, necessitating continuous advances in machine learning methodologies.

The second edition of this Special Issue centers on the intricacies of real-world data and the impact of entropy on machine learning algorithms. Our particular focus lies on novel classification algorithms that harness the power of data science to model and process diverse real-world datasets effectively.

We welcome researchers to submit their original work, showcasing innovative approaches to data classification and analysis of real-world datasets, with a keen emphasis on entropy and its influence on machine learning effectiveness. We aim to foster a collaborative platform for exchanging knowledge and experiences related to cutting-edge developments in data science and data classification.

Prof. Dr. Jan Kozak
Dr. Artur Kozłowski
Dr. Barbara Probierz
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • real-world datasets
  • data science
  • machine learning algorithms
  • optimization
  • classification
  • prediction methods
  • entropy in big data

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)

Research

17 pages, 1438 KiB  
Article
An Empirical Study of Self-Supervised Learning with Wasserstein Distance
by Makoto Yamada, Yuki Takezawa, Guillaume Houry, Kira Michaela Düsterwald, Deborah Sulem, Han Zhao and Yao-Hung Tsai
Entropy 2024, 26(11), 939; https://doi.org/10.3390/e26110939 - 31 Oct 2024
Cited by 1 | Viewed by 939
Abstract
In this study, we consider the problem of self-supervised learning (SSL) utilizing the 1-Wasserstein distance on a tree structure (a.k.a., Tree-Wasserstein distance (TWD)), where TWD is defined as the L1 distance between two tree-embedded vectors. In SSL methods, the cosine similarity is often utilized as an objective function; however, it has not been well studied when utilizing the Wasserstein distance. Training the Wasserstein distance is numerically challenging. Thus, this study empirically investigates a strategy for optimizing the SSL with the Wasserstein distance and finds a stable training procedure. More specifically, we evaluate the combination of two types of TWD (total variation and ClusterTree) and several probability models, including the softmax function, the ArcFace probability model, and simplicial embedding. We propose a simple yet effective Jeffrey divergence-based regularization method to stabilize optimization. Through empirical experiments on STL10, CIFAR10, CIFAR100, and SVHN, we find that a simple combination of the softmax function and TWD can obtain significantly lower results than the standard SimCLR. Moreover, a simple combination of TWD and SimSiam fails to train the model. We find that the model performance depends on the combination of TWD and probability model, and that the Jeffrey divergence regularization helps in model training. Finally, we show that the appropriate combination of the TWD and probability model outperforms cosine similarity-based representation learning.
(This article belongs to the Special Issue Entropy in Real-World Datasets and Its Impact on Machine Learning II)
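
The abstract above describes TWD as the L1 distance between two tree-embedded vectors. A minimal Python sketch of that idea follows, assuming a tree stored as a parent array with nodes numbered so that every parent precedes its children. The function names `tree_embed` and `tree_wasserstein` and the toy tree are illustrative assumptions and are not taken from the paper.

```python
def tree_embed(parent, weight, mass):
    """Map a distribution over tree nodes to its tree-embedded vector.

    parent[v] is the parent of node v (node 0 is the root, parent[0] = -1).
    Assumption: nodes are numbered so that parent[v] < v for every v > 0.
    weight[v] is the length of the edge from v to parent[v]; weight[0] is unused.
    """
    subtree_mass = list(mass)
    # Visit children before parents so each subtree total is complete
    # by the time it is added to its parent.
    for v in range(len(parent) - 1, 0, -1):
        subtree_mass[parent[v]] += subtree_mass[v]
    # One coordinate per edge: edge length times the mass below that edge.
    return [weight[v] * subtree_mass[v] for v in range(1, len(parent))]


def tree_wasserstein(parent, weight, mu, nu):
    """L1 distance between the tree embeddings of mu and nu."""
    a = tree_embed(parent, weight, mu)
    b = tree_embed(parent, weight, nu)
    return sum(abs(x - y) for x, y in zip(a, b))


# Toy star tree (hypothetical): root 0 with two leaves, 1 and 2.
parent = [-1, 0, 0]
mu = [0.0, 1.0, 0.0]  # all mass on leaf 1
nu = [0.0, 0.0, 1.0]  # all mass on leaf 2

unit_edges = tree_wasserstein(parent, [0.0, 1.0, 1.0], mu, nu)
half_edges = tree_wasserstein(parent, [0.0, 0.5, 0.5], mu, nu)
```

With unit edge lengths the mass travels along two edges, so the distance is 2.0. With edge lengths of 0.5 on a star tree and all mass on the leaves, the same computation reduces to the total variation distance, which is 1.0 here.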

18 pages, 11924 KiB  
Article
Boxing Punch Detection with Single Static Camera
by Piotr Stefański, Jan Kozak and Tomasz Jach
Entropy 2024, 26(8), 617; https://doi.org/10.3390/e26080617 - 23 Jul 2024
Viewed by 1458
Abstract
Computer vision in sports analytics is gaining in popularity. Monitoring players’ performance using cameras is more flexible and does not interfere with player equipment compared to systems using sensors. This provides a wide set of opportunities for computer vision systems that help coaches, reporters, and audiences. This paper provides an introduction to the problem of measuring boxers’ performance, with a comprehensive survey of approaches in current science. The main goal of the paper is to provide a system to automatically detect punches in Olympic boxing using a single static camera. The authors use Euclidean distance to measure the distance between boxers and convolutional neural networks to classify footage frames. In order to improve classification performance, we provide and test three approaches to manipulating the images prior to fitting the classifier. The proposed solution achieves 95% balanced accuracy, 49% F1 score for frames with punches, and 97% for frames without punches. Finally, we present a working system for analyses of a boxing scene that marks boxers and labelled frames with detected clashes and punches.
(This article belongs to the Special Issue Entropy in Real-World Datasets and Its Impact on Machine Learning II)
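
The abstract above reports balanced accuracy alongside per-class F1 scores. As a hedged illustration of how these metrics are computed from a binary confusion matrix, the Python sketch below uses invented counts (they are not the paper's data) to show one way a rare positive class can combine high balanced accuracy with a much lower F1 score.

```python
def balanced_accuracy(tp, fn, fp, tn):
    """Mean of the recall on the positive class and the recall on the negative class."""
    recall_pos = tp / (tp + fn)
    recall_neg = tn / (tn + fp)
    return (recall_pos + recall_neg) / 2


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the class counted as positive."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical frame counts: 100 frames with punches, 9900 without.
tp, fn = 95, 5       # punch frames: detected / missed
fp, tn = 200, 9700   # non-punch frames: false alarms / correctly rejected

bal_acc = balanced_accuracy(tp, fn, fp, tn)
f1_punch = f1_score(tp, fp, fn)
# For the negative class, the roles of the counts are swapped.
f1_no_punch = f1_score(tn, fn, fp)
```

In this invented example, the 200 false alarms are a small fraction of the 9900 negative frames but outnumber the 95 true detections, so precision on the rare class, and with it the F1 score, drops while both recalls stay high.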

21 pages, 5009 KiB  
Article
A Novel Classification Method: Neighborhood-Based Positive Unlabeled Learning Using Decision Tree (NPULUD)
by Bita Ghasemkhani, Kadriye Filiz Balbal, Kokten Ulas Birant and Derya Birant
Entropy 2024, 26(5), 403; https://doi.org/10.3390/e26050403 - 4 May 2024
Cited by 3 | Viewed by 2289
Abstract
In a standard binary supervised classification task, the existence of both negative and positive samples in the training dataset is required to construct a classification model. However, this condition is not met in certain applications where only one class of samples is obtainable. To overcome this problem, a different classification method, which learns from positive and unlabeled (PU) data, must be incorporated. In this study, a novel method is presented: neighborhood-based positive unlabeled learning using decision tree (NPULUD). First, NPULUD uses the nearest neighborhood approach for the PU strategy and then employs a decision tree algorithm for the classification task by utilizing the entropy measure. Entropy played a pivotal role in assessing the level of uncertainty in the training dataset, as a decision tree was developed with the purpose of classification. Through experiments, we validated our method over 24 real-world datasets. The proposed method attained an average accuracy of 87.24%, while the traditional supervised learning approach obtained an average accuracy of 83.99% on the datasets. It is also demonstrated that our method obtained a statistically notable enhancement (7.74%), with respect to state-of-the-art peers, on average.
(This article belongs to the Special Issue Entropy in Real-World Datasets and Its Impact on Machine Learning II)
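
The abstract above mentions a decision tree that uses the entropy measure. The Python sketch below shows the standard Shannon entropy and information gain calculation commonly used as a decision tree split criterion. It is a generic illustration, not the NPULUD method itself, and the function names and toy labels are assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy reduction obtained by splitting `labels` into `groups`."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - remainder


# Toy labels (hypothetical): two positive and two negative samples.
labels = ["pos", "pos", "neg", "neg"]

# A split that separates the classes perfectly removes all uncertainty.
perfect_gain = information_gain(labels, [["pos", "pos"], ["neg", "neg"]])

# A split that leaves each group evenly mixed removes none.
useless_gain = information_gain(labels, [["pos", "neg"], ["pos", "neg"]])
```

A tree learner using this criterion would prefer the first split, since it has the higher information gain (1.0 bit versus 0.0 bits).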