Information Theory for Interpretable Machine Learning

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory, Probability and Statistics".

Deadline for manuscript submissions: closed (15 May 2024) | Viewed by 12031

Special Issue Editors


Dr. Marco Piangerelli
Guest Editor
Computer Science Division, School of Science and Technology, University of Camerino, 62032 Camerino, Italy
Interests: complexity; topological data analysis; higher-order interactions; self-adaptive systems; deep learning; information theory; pattern recognition; interpretable machine learning; artificial intelligence; intelligent manufacturing; computer vision; signal processing; robotics

Dr. Sotiris Kotsiantis
Guest Editor
Department of Mathematics, University of Patras, 265-00 Patras, Greece
Interests: artificial intelligence; machine learning; data mining; knowledge discovery; data science

Special Issue Information

Dear Colleagues,

Machine learning (ML) and deep learning (DL) are increasingly used in many fields, from physics to medicine and from the social sciences to manufacturing. This widespread use has led to the emergence of increasingly complex computational models, which are, in essence, black boxes. Can we trust these models to make important decisions for our lives? To answer this question, methods have been developed that use local, approximate models to explain the results of the main black-box models. Unfortunately, explanations for black boxes are often problematic and misleading.

What should be done instead is to focus on interpretable models that are not black boxes, but true predictive models with particular characteristics. In deep neural networks, the most important example of black-box models, one strategy for creating interpretable models could be to make the flow of information through the network easier to understand by ensuring that groups of neurons always manipulate specific concepts (disentanglement).

In this area, the contribution of information theory could be highly impactful.

This Special Issue aims to be a forum for the presentation of new and improved information-theoretic techniques for interpretable machine/deep learning and data science. Applications of information theory to all kinds of neural networks, aimed at developing both supervised and unsupervised strategies for interpretability, as well as applications to unsupervised dimensionality reduction, fall within the scope of this Special Issue.

Dr. Marco Piangerelli
Dr. Sotiris Kotsiantis
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • interpretable machine learning
  • information disentanglement
  • information theory
  • reinforcement learning
  • supervised learning
  • unsupervised learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available from MDPI.

Published Papers (5 papers)


Research

19 pages, 2033 KiB  
Article
MV–MR: Multi-Views and Multi-Representations for Self-Supervised Learning and Knowledge Distillation
by Vitaliy Kinakh, Mariia Drozdova and Slava Voloshynovskiy
Entropy 2024, 26(6), 466; https://doi.org/10.3390/e26060466 - 29 May 2024
Viewed by 735
Abstract
We present a new method of self-supervised learning and knowledge distillation based on multi-views and multi-representations (MV–MR). MV–MR is based on the maximization of dependence between learnable embeddings from augmented and non-augmented views, jointly with the maximization of dependence between learnable embeddings from the augmented view and multiple non-learnable representations from the non-augmented view. We show that the proposed method can be used for efficient self-supervised classification and model-agnostic knowledge distillation. Unlike other self-supervised techniques, our approach does not use any contrastive learning, clustering, or stop gradients. MV–MR is a generic framework allowing the incorporation of constraints on the learnable embeddings via the usage of image multi-representations as regularizers. The proposed method is used for knowledge distillation. MV–MR provides state-of-the-art self-supervised performance on the STL10 and CIFAR20 datasets in a linear evaluation setup. We show that a low-complexity ResNet50 model pretrained using the proposed knowledge distillation based on the CLIP ViT model achieves state-of-the-art performance on the STL10 and CIFAR100 datasets.
(This article belongs to the Special Issue Information Theory for Interpretable Machine Learning)

29 pages, 5108 KiB  
Article
TURBO: The Swiss Knife of Auto-Encoders
by Guillaume Quétant, Yury Belousov, Vitaliy Kinakh and Slava Voloshynovskiy
Entropy 2023, 25(10), 1471; https://doi.org/10.3390/e25101471 - 21 Oct 2023
Cited by 3 | Viewed by 2090
Abstract
We present a novel information-theoretic framework, termed TURBO, designed to systematically analyse and generalise auto-encoding methods. We start by examining the principles of information bottleneck and bottleneck-based networks in the auto-encoding setting and identifying their inherent limitations, which become more prominent for data with multiple relevant, physics-related representations. The TURBO framework is then introduced, providing a comprehensive derivation of its core concept consisting of the maximisation of mutual information between various data representations expressed in two directions reflecting the information flows. We illustrate that numerous prevalent neural network models are encompassed within this framework. The paper underscores the insufficiency of the information bottleneck concept in elucidating all such models, thereby establishing TURBO as a preferable theoretical reference. The introduction of TURBO contributes to a richer understanding of data representation and the structure of neural network models, enabling more efficient and versatile applications.

23 pages, 1687 KiB  
Article
Reinforcement Learning-Based Decentralized Safety Control for Constrained Interconnected Nonlinear Safety-Critical Systems
by Chunbin Qin, Yinliang Wu, Jishi Zhang and Tianzeng Zhu
Entropy 2023, 25(8), 1158; https://doi.org/10.3390/e25081158 - 2 Aug 2023
Cited by 1 | Viewed by 1195
Abstract
This paper addresses the problem of decentralized safety control (DSC) of constrained interconnected nonlinear safety-critical systems under reinforcement learning strategies, where asymmetric input constraints and security constraints are considered. To begin with, improved performance functions associated with the actuator estimates for each auxiliary subsystem are constructed. Then, the decentralized control problem with security constraints and asymmetric input constraints is transformed into an equivalent decentralized control problem with asymmetric input constraints using the barrier function. This approach ensures that safety-critical systems operate and learn optimal DSC policies within their safe global domains. Next, the optimal control strategy is shown to ensure that the entire system is uniformly ultimately bounded (UUB). In addition, all signals in the closed-loop auxiliary subsystem are shown, based on Lyapunov theory, to be uniformly ultimately bounded, and the effectiveness of the designed method is verified by practical simulation.

18 pages, 15138 KiB  
Article
Wasserstein Distance-Based Deep Leakage from Gradients
by Zifan Wang, Changgen Peng, Xing He and Weijie Tan
Entropy 2023, 25(5), 810; https://doi.org/10.3390/e25050810 - 17 May 2023
Cited by 1 | Viewed by 2209
Abstract
Federated learning protects private information in the data set by sharing the average gradient. However, the “Deep Leakage from Gradient” (DLG) algorithm, a gradient-based feature reconstruction attack, can recover private training data using gradients shared in federated learning, resulting in private information leakage. The algorithm nevertheless suffers from slow model convergence and poor accuracy of the inverted images. To address these issues, a Wasserstein distance-based DLG method, named WDLG, is proposed. The WDLG method uses the Wasserstein distance as the training loss function to improve the inverted image quality and the model convergence. The hard-to-calculate Wasserstein distance is computed iteratively using the Lipschitz condition and Kantorovich–Rubinstein duality. Theoretical analysis proves the differentiability and continuity of the Wasserstein distance. Finally, experimental results show that the WDLG algorithm is superior to DLG in training speed and inversion image quality. At the same time, we show through experiments that differential privacy can be used for disturbance protection, which provides some ideas for the development of a deep learning framework to protect privacy.

18 pages, 2717 KiB  
Article
Detecting Information Relays in Deep Neural Networks
by Arend Hintze and Christoph Adami
Entropy 2023, 25(3), 401; https://doi.org/10.3390/e25030401 - 22 Feb 2023
Cited by 2 | Viewed by 4420
Abstract
Deep learning of artificial neural networks (ANNs) is creating highly functional processes that are, unfortunately, nearly as hard to interpret as their biological counterparts. Identification of functional modules in natural brains plays an important role in cognitive science and neuroscience alike, and can be carried out using a wide range of technologies such as fMRI, EEG/ERP, MEG, or calcium imaging. However, we do not have such robust methods at our disposal when it comes to understanding functional modules in artificial neural networks. Ideally, understanding which parts of an artificial neural network perform what function might help us to address a number of vexing problems in ANN research, such as catastrophic forgetting and overfitting. Furthermore, revealing a network’s modularity could improve our trust in such networks by making these black boxes more transparent. Here, we introduce a new information-theoretic concept that proves useful in understanding and analyzing a network’s functional modularity: the relay information I_R. The relay information measures how much information groups of neurons that participate in a particular function (modules) relay from inputs to outputs. Combined with a greedy search algorithm, relay information can be used to identify computational modules in neural networks. We also show that the functionality of modules correlates with the amount of relay information they carry.