Mach. Learn. Knowl. Extr., Volume 3, Issue 1 (March 2021) – 14 articles

Cover Story: We focus on the main challenges in AI system engineering along the development cycle of machine learning systems including lessons learned from past and ongoing research. This will be done by taking into account intrinsic conditions of deep learning models, data and software quality issues, and human-centered artificial intelligence (AI) postulates, including confidentiality and ethical aspects. The analysis outlines a fundamental theory-practice gap that superimposes the challenges of AI system engineering at the level of data quality assurance, model building, software engineering, and deployment. The aim of this paper is to pinpoint research topics relevant for AI system engineering by exploring approaches and challenges particularly posed by data quality assurance, embedded AI, confidentiality-preserving transfer learning, human–AI teaming, and ethics by design.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
15 pages, 921 KiB  
Perspective
From the Digital Data Revolution toward a Digital Society: Pervasiveness of Artificial Intelligence
by Frank Emmert-Streib
Mach. Learn. Knowl. Extr. 2021, 3(1), 284-298; https://doi.org/10.3390/make3010014 - 4 Mar 2021
Cited by 21 | Viewed by 6347
Abstract
Technological progress has led to powerful computers and communication technologies that now penetrate all areas of science, industry and our private lives. As a consequence, all these areas are generating digital traces of data amounting to big data resources. This opens unprecedented opportunities but also challenges toward the analysis, management, interpretation and responsible usage of such data. In this paper, we discuss these developments and the fields that have been particularly affected by the digital revolution. Our discussion is AI-centered, showing domain-specific prospects but also intricacies for the method development in artificial intelligence. For instance, we discuss recent breakthroughs in deep learning algorithms and artificial intelligence as well as advances in text mining and natural language processing, e.g., word-embedding methods that enable the processing of large amounts of text data from diverse sources such as governmental reports, blog entries in social media or clinical health records of patients. Furthermore, we discuss the necessity of further improving general artificial intelligence approaches and of utilizing advanced learning paradigms. This leads to arguments for the establishment of statistical artificial intelligence. Finally, we provide an outlook on important aspects of future challenges that are of crucial importance for the development of all fields, including ethical AI and the influence of bias on AI systems. As a potential end-point of this development, we define digital society as the asymptotic limiting state of digital economy that emerges from fully connected information and communication technologies enabling the pervasiveness of AI. Overall, our discussion provides a perspective on the elaborate relatedness of digital data and AI systems.
(This article belongs to the Section Data)

21 pages, 2004 KiB  
Article
Leaving No Stone Unturned: Flexible Retrieval of Idiomatic Expressions from a Large Text Corpus
by Callum Hughes, Maxim Filimonov, Alison Wray and Irena Spasić
Mach. Learn. Knowl. Extr. 2021, 3(1), 263-283; https://doi.org/10.3390/make3010013 - 3 Mar 2021
Cited by 5 | Viewed by 4227
Abstract
Idioms are multi-word expressions whose meaning cannot always be deduced from the literal meaning of constituent words. A key feature of idioms that is central to this paper is their peculiar mixture of fixedness and variability, which poses challenges for their retrieval from large corpora using traditional search approaches. These challenges hinder insights into idiom usage, affecting users who are conducting linguistic research as well as those involved in language education. To facilitate access to idiom examples taken from real-world contexts, we introduce an information retrieval system designed specifically for idioms. Given a search query that represents an idiom, typically in its canonical form, the system expands it automatically to account for the most common types of idiom variation including inflection, open slots, adjectival or adverbial modification and passivisation. As a by-product of query expansion, other types of idiom variation captured include derivation, compounding, negation, distribution across multiple clauses as well as other unforeseen types of variation. The system was implemented on top of Elasticsearch, an open-source, distributed, scalable, real-time search engine. Flexible retrieval of idioms is supported by a combination of linguistic pre-processing of the search queries, their translation into a set of query clauses written in a query language called Query DSL, and analysis, an indexing process that involves tokenisation and normalisation. Our system outperformed the phrase search in terms of recall and outperformed the keyword search in terms of precision. Out of the three, our approach was found to provide the best balance between precision and recall. By providing a fast and easy way of finding idioms in large corpora, our approach can facilitate further developments in fields such as linguistics, language education and natural language processing.
(This article belongs to the Section Data)

20 pages, 13641 KiB  
Article
Automatic Feature Selection for Improved Interpretability on Whole Slide Imaging
by Antoine Pirovano, Hippolyte Heuberger, Sylvain Berlemont, Saïd Ladjal and Isabelle Bloch
Mach. Learn. Knowl. Extr. 2021, 3(1), 243-262; https://doi.org/10.3390/make3010012 - 22 Feb 2021
Cited by 4 | Viewed by 4485
Abstract
Deep learning methods are widely used for medical applications to assist medical doctors in their daily routine. While performance reaches expert level, interpretability (highlighting how and what a trained model learned and why it makes a specific decision) is the next important challenge that deep learning methods need to answer to be fully integrated in the medical field. In this paper, we address the question of interpretability in the context of whole slide image (WSI) classification with the formalization of the design of WSI classification architectures and propose a piece-wise interpretability approach, relying on gradient-based methods, feature visualization and multiple instance learning context. After training two WSI classification architectures on the Camelyon-16 WSI dataset, highlighting discriminative features learned, and validating our approach with pathologists, we propose a novel manner of computing interpretability slide-level heat-maps, based on the extracted features, that improves tile-level classification performance. We measure the improvement using the tile-level AUC, which we call Localization AUC, and show an improvement of more than 0.2. We also validate our results with a RemOve And Retrain (ROAR) measure. Then, after studying the impact of the number of features used for heat-map computation, we propose a corrective approach, relying on activation colocalization of selected features, that improves the performance and the stability of our proposed method.

15 pages, 7734 KiB  
Article
A Combined Short Time Fourier Transform and Image Classification Transformer Model for Rolling Element Bearings Fault Diagnosis in Electric Motors
by Christos T. Alexakos, Yannis L. Karnavas, Maria Drakaki and Ioannis A. Tziafettas
Mach. Learn. Knowl. Extr. 2021, 3(1), 228-242; https://doi.org/10.3390/make3010011 - 16 Feb 2021
Cited by 29 | Viewed by 6065
Abstract
The most frequent faults in rotating electrical machines occur in their rolling element bearings. Thus, an effective health diagnosis mechanism of rolling element bearings is necessary from operational and economical points of view. Recently, convolutional neural networks (CNNs) have been proposed for bearing fault detection and identification. However, two major drawbacks of these models are (a) their lack of ability to capture global information about the input vector and to derive knowledge about the statistical properties of the latter and (b) the high demand for computational resources. In this paper, short time Fourier transform (STFT) is proposed as a pre-processing step to acquire time-frequency representation vibration images from raw data in variable healthy or faulty conditions. To diagnose and classify the vibration images, the image classification transformer (ICT), inspired by the transformers used for natural language processing, has been suitably adapted to work as an image classifier trained in a supervised manner and is also proposed as an alternative method to CNNs. Simulation results on a famous and well-established rolling element bearing fault detection benchmark show the effectiveness of the proposed method, which achieved 98.3% accuracy (on the test dataset) while requiring substantially fewer computational resources to be trained compared to the CNN approach.
(This article belongs to the Section Learning)
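
The STFT pre-processing step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `stft_magnitude`, the Hann window, the frame length of 64 and the hop of 32 are all assumptions chosen for illustration, and the toy sine wave stands in for real vibration data. The sketch slices a signal into overlapping frames and computes the magnitude spectrum of each frame, giving a small time-frequency "image".

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return one magnitude spectrum (bins 0..frame_len//2) per frame."""
    # A Hann window tapers each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Hypothetical toy signal: 8 cycles per 64 samples, so energy sits in bin 8.
toy = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
image = stft_magnitude(toy)  # rows are time frames, columns are frequency bins
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in image]
```

Each row of `image` is one time step and each column one frequency bin; a classifier such as the one described in the abstract would consume an image of this kind rather than the raw waveform. A production version would normally use an FFT routine instead of the direct DFT shown here.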

23 pages, 981 KiB  
Article
Property Checking with Interpretable Error Characterization for Recurrent Neural Networks
by Franz Mayr, Sergio Yovine and Ramiro Visca
Mach. Learn. Knowl. Extr. 2021, 3(1), 205-227; https://doi.org/10.3390/make3010010 - 12 Feb 2021
Cited by 10 | Viewed by 3418
Abstract
This paper presents a novel on-the-fly, black-box, property-checking through learning approach as a means for verifying requirements of recurrent neural networks (RNNs) in the context of sequence classification. Our technique builds on a tool for learning probably approximately correct (PAC) deterministic finite automata (DFA). The sequence classifier inside the black-box consists of a Boolean combination of several components, including the RNN under analysis together with requirements to be checked, possibly modeled as RNNs themselves. On the one hand, if the output of the algorithm is an empty DFA, there is a proven upper bound (as a function of the algorithm parameters) on the probability of the language of the black-box to be nonempty. This implies the property probably holds on the RNN with probabilistic guarantees. On the other hand, if the DFA is nonempty, it is certain that the language of the black-box is nonempty. This entails the RNN does not satisfy the requirement for sure. In this case, the output automaton serves as an explicit and interpretable characterization of the error. Our approach does not rely on a specific property specification formalism and is capable of handling nonregular languages as well. Besides, it neither explicitly builds individual representations of any of the components of the black-box nor resorts to any external decision procedure for verification. This paper also improves previous theoretical results regarding the probabilistic guarantees of the underlying learning algorithm.
(This article belongs to the Special Issue Selected Papers from CD-MAKE 2020 and ARES 2020)

35 pages, 7799 KiB  
Article
Explainable AI Framework for Multivariate Hydrochemical Time Series
by Michael C. Thrun, Alfred Ultsch and Lutz Breuer
Mach. Learn. Knowl. Extr. 2021, 3(1), 170-204; https://doi.org/10.3390/make3010009 - 4 Feb 2021
Cited by 22 | Viewed by 5967
Abstract
The understanding of water quality and its underlying processes is important for the protection of aquatic environments. With the rare opportunity of access to a domain expert, an explainable AI (XAI) framework is proposed that is applicable to multivariate time series. The XAI provides explanations that are interpretable by domain experts. In three steps, it combines a data-driven choice of a distance measure with supervised decision trees guided by projection-based clustering. The multivariate time series consists of water quality measurements, including nitrate, electrical conductivity, and twelve other environmental parameters. The relationships between water quality and the environmental parameters are investigated by identifying similar days within a cluster and dissimilar days between clusters. The framework, called DDS-XAI, does not depend on prior knowledge about data structure, and its explanations tend to be contrastive. The relationships in the data can be visualized by a topographic map representing high-dimensional structures. Two state-of-the-art XAIs called eUD3.5 and iterative mistake minimization (IMM) were unable to provide meaningful and relevant explanations from the three multivariate time series data. The DDS-XAI framework can be swiftly applied to new data. Open-source code in R for all steps of the XAI framework is provided, and the steps are structured in an application-oriented way.

2 pages, 167 KiB  
Editorial
Acknowledgment to Reviewers of MAKE in 2020
by MAKE Editorial Office
Mach. Learn. Knowl. Extr. 2021, 3(1), 168-169; https://doi.org/10.3390/make3010008 - 27 Jan 2021
Viewed by 2071
Abstract
Peer review is the driving force of journal development, and reviewers are gatekeepers who ensure that MAKE maintains its standards for the high quality of its published papers [...]
49 pages, 5457 KiB  
Article
Interpretable Topic Extraction and Word Embedding Learning Using Non-Negative Tensor DEDICOM
by Lars Hillebrand, David Biesner, Christian Bauckhage and Rafet Sifa
Mach. Learn. Knowl. Extr. 2021, 3(1), 123-167; https://doi.org/10.3390/make3010007 - 19 Jan 2021
Cited by 3 | Viewed by 3857
Abstract
Unsupervised topic extraction is a vital step in automatically extracting concise contentual information from large text corpora. Existing topic extraction methods lack the capability of linking relations between these topics, which would further help text understanding. Therefore, we propose utilizing the Decomposition into Directional Components (DEDICOM) algorithm, which provides a uniquely interpretable matrix factorization for symmetric and asymmetric square matrices and tensors. We constrain DEDICOM to row-stochasticity and non-negativity in order to factorize pointwise mutual information matrices and tensors of text corpora. We identify latent topic clusters and their relations within the vocabulary and simultaneously learn interpretable word embeddings. Further, we introduce multiple methods based on alternating gradient descent to efficiently train constrained DEDICOM algorithms. We evaluate the qualitative topic modeling and word embedding performance of our proposed methods on several datasets, including a novel New York Times news dataset, and demonstrate how the DEDICOM algorithm provides deeper text analysis than competing matrix factorization approaches.
(This article belongs to the Special Issue Selected Papers from CD-MAKE 2020 and ARES 2020)
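
The abstract above factorizes pointwise mutual information (PMI) matrices of text corpora. As a rough illustration of the input to such a factorization, the sketch below builds a PMI matrix from a toy corpus. It is an assumption-laden example, not the paper's pipeline: the helper `pmi_matrix`, the use of whole sentences as the co-occurrence window, and the convention of assigning 0 to unseen pairs are all choices made here for brevity, and the DEDICOM factorization itself is not shown.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_matrix(sentences):
    """Return (vocab, matrix) with matrix[i][j] = PMI of vocab[i] and vocab[j]."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        tokens = sorted(set(sentence.split()))
        word_counts.update(tokens)
        # Count each co-occurring pair once per sentence, in both directions.
        for a, b in combinations(tokens, 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    vocab = sorted(word_counts)
    n = len(sentences)
    matrix = []
    for a in vocab:
        row = []
        for b in vocab:
            joint = pair_counts[(a, b)] / n
            if joint == 0:
                # Assumed convention: unseen pairs and the diagonal get 0.
                row.append(0.0)
            else:
                p_a = word_counts[a] / n
                p_b = word_counts[b] / n
                row.append(math.log(joint / (p_a * p_b)))
        matrix.append(row)
    return vocab, matrix


# Hypothetical four-sentence corpus.
toy_corpus = ["rain cloud storm", "rain cloud", "stock market", "stock market rain"]
vocab, pmi = pmi_matrix(toy_corpus)
```

Words that appear together more often than chance would predict ("stock" and "market" here) receive a positive score, while pairs that co-occur less often than expected ("market" and "rain") receive a negative one.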

28 pages, 2162 KiB  
Article
Learning DOM Trees of Web Pages by Subpath Kernel and Detecting Fake e-Commerce Sites
by Kilho Shin, Taichi Ishikawa, Yu-Lu Liu and David Lawrence Shepard
Mach. Learn. Knowl. Extr. 2021, 3(1), 95-122; https://doi.org/10.3390/make3010006 - 14 Jan 2021
Cited by 8 | Viewed by 4854
Abstract
The subpath kernel is a class of positive definite kernels defined over trees, which has the following advantages for the purposes of classification, regression and clustering: it can be incorporated into a variety of powerful kernel machines including SVM; it is invariant whether input trees are ordered or unordered; it can be computed by fast linear-time algorithms; and, finally, its excellent learning performance has been proven through intensive experiments in the literature. In this paper, we leverage recent advances in tree kernels to solve real problems. As an example, we apply our method to the problem of detecting fake e-commerce sites. Although the problem is similar to phishing site detection, the fact that mimicking existing authentic sites is harmful for fake e-commerce sites marks a clear difference between these two problems. We focus on fake e-commerce site detection for three reasons: e-commerce fraud is a real problem that companies and law enforcement have been cooperating to solve; inefficiency hampers existing approaches because datasets tend to be large, while subpath kernel learning overcomes these performance challenges; and we offer increased resiliency against attempts to subvert existing detection methods through incorporating robust features that adversaries cannot change: the DOM trees of websites. Our real-world results are remarkable: our method has exhibited accuracy as high as 0.998 when training SVM with 1000 instances and evaluating accuracy for almost 7000 independent instances. Its generalization efficiency is also excellent: with only 100 training instances, the accuracy score reached 0.996.
(This article belongs to the Special Issue Selected Papers from CD-MAKE 2020 and ARES 2020)

11 pages, 886 KiB  
Article
Rumor Detection Based on SAGNN: Simplified Aggregation Graph Neural Networks
by Liang Zhang, Jingqun Li, Bin Zhou and Yan Jia
Mach. Learn. Knowl. Extr. 2021, 3(1), 84-94; https://doi.org/10.3390/make3010005 - 4 Jan 2021
Cited by 10 | Viewed by 5372
Abstract
Identifying fake news in the media has been an important issue. This is especially true considering the wide spread of rumors on popular social networks such as Twitter. Various kinds of techniques have been proposed for automatic rumor detection. In this work, we study the application of graph neural networks for rumor classification at a lower level, instead of applying existing neural network architectures to detect rumors. The responses to true rumors and false rumors display distinct characteristics. This suggests that it is essential to capture such interactions in an effective manner for a deep learning network to achieve better rumor detection performance. To this end, we present a simplified aggregation graph neural network architecture. Experiments on publicly available Twitter datasets demonstrate that the proposed network has performance on a par with or even better than that of state-of-the-art graph convolutional networks, while significantly reducing the computational complexity.
(This article belongs to the Section Learning)

28 pages, 3799 KiB  
Review
AI System Engineering—Key Challenges and Lessons Learned
by Lukas Fischer, Lisa Ehrlinger, Verena Geist, Rudolf Ramler, Florian Sobiezky, Werner Zellinger, David Brunner, Mohit Kumar and Bernhard Moser
Mach. Learn. Knowl. Extr. 2021, 3(1), 56-83; https://doi.org/10.3390/make3010004 - 31 Dec 2020
Cited by 29 | Viewed by 18735
Abstract
The main challenges are discussed together with the lessons learned from past and ongoing research along the development cycle of machine learning systems. This will be done by taking into account intrinsic conditions of today's deep learning models, data and software quality issues, and human-centered artificial intelligence (AI) postulates, including confidentiality and ethical aspects. The analysis outlines a fundamental theory-practice gap which superimposes the challenges of AI system engineering at the level of data quality assurance, model building, software engineering and deployment. The aim of this paper is to pinpoint research topics to explore approaches to address these challenges.
(This article belongs to the Special Issue Selected Papers from CD-MAKE 2020 and ARES 2020)

22 pages, 2506 KiB  
Article
Robust Learning with Implicit Residual Networks
by Viktor Reshniak and Clayton G. Webster
Mach. Learn. Knowl. Extr. 2021, 3(1), 34-55; https://doi.org/10.3390/make3010003 - 31 Dec 2020
Cited by 5 | Viewed by 3411
Abstract
In this effort, we propose a new deep architecture utilizing residual blocks inspired by implicit discretization schemes. As opposed to the standard feed-forward networks, the outputs of the proposed implicit residual blocks are defined as the fixed points of the appropriately chosen nonlinear transformations. We show that this choice leads to the improved stability of both forward and backward propagations, has a favorable impact on the generalization power, and allows for controlling the robustness of the network with only a few hyperparameters. In addition, the proposed reformulation of ResNet does not introduce new parameters and can potentially lead to a reduction in the number of required layers due to improved forward stability. Finally, we derive the memory-efficient training algorithm, propose a stochastic regularization technique, and provide numerical results in support of our findings.
(This article belongs to the Special Issue Explainable Machine Learning)

20 pages, 6114 KiB  
Article
Understand Daily Fire Suppression Resource Ordering and Assignment Patterns by Unsupervised Learning
by Yu Wei, Matthew P. Thompson, Erin J. Belval, David E. Calkin and Jude Bayham
Mach. Learn. Knowl. Extr. 2021, 3(1), 14-33; https://doi.org/10.3390/make3010002 - 23 Dec 2020
Cited by 5 | Viewed by 3076
Abstract
Wildland fire management agencies are responsible for assigning suppression resources to control fire spread and mitigate fire risks. This study implements a principal component analysis and an association rule analysis to study wildland fire response resource requests from 2016 to 2018 in the western US to identify daily resource ordering and assignment patterns for large fire incidents. Unsupervised learning can identify patterns in the assignment of individual resources or pairs of resources. Three national Geographic Area Coordination Centers (GACCs) are studied, including California (CA), Rocky Mountain (RMC), and Southwest (SWC) at both high and low suppression preparedness levels (PLs). Substantial differences are found in resource ordering and assignment between GACCs. For example, in comparison with RMC and SWC, CA generally orders and dispatches more resources to a fire per day; CA also likely orders and assigns multiple resource types in combination. Resources are more likely to be assigned to a fire at higher PLs in all GACCs. This study also suggests several future research directions including studying the causal relations behind different resource ordering and assignment patterns in different regions.
(This article belongs to the Section Learning)
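
The association rule analysis mentioned in the abstract above rests on three standard quantities: support, confidence and lift. The sketch below computes them for a single rule. The helper `rule_metrics`, the resource names and the four toy "days" are hypothetical and are not taken from the study's data.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    n = len(transactions)
    both = sum(1 for t in transactions if antecedent in t and consequent in t)
    ante = sum(1 for t in transactions if antecedent in t)
    cons = sum(1 for t in transactions if consequent in t)
    support = both / n                          # share of days with both items
    confidence = both / ante if ante else 0.0   # P(consequent | antecedent)
    lift = confidence / (cons / n) if cons else 0.0  # confidence vs. base rate
    return support, confidence, lift


# Hypothetical data: each set lists resource types assigned to one fire on one day.
toy_days = [
    {"engine", "crew"},
    {"engine", "crew", "helicopter"},
    {"engine"},
    {"crew", "helicopter"},
]
support, confidence, lift = rule_metrics(toy_days, "engine", "crew")
```

A lift above 1 would suggest the two resource types are assigned together more often than their individual frequencies imply; in this toy data the lift for "engine -> crew" is slightly below 1.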

13 pages, 657 KiB  
Article
The Predictive Value of Data from Virtual Investment Communities
by Benjamin M. Abdel-Karim, Alexander Benlian and Oliver Hinz
Mach. Learn. Knowl. Extr. 2021, 3(1), 1-13; https://doi.org/10.3390/make3010001 - 23 Dec 2020
Cited by 1 | Viewed by 2961
Abstract
Optimal investment decisions by institutional investors require accurate predictions with respect to the development of stock markets. Motivated by previous research that revealed the unsatisfactory performance of existing stock market prediction models, this study proposes a novel prediction approach. Our proposed system combines Artificial Intelligence (AI) with data from Virtual Investment Communities (VICs) and leverages VICs’ ability to support the process of predicting stock markets. An empirical study with two different models using real data shows the potential of the AI-based system with VIC information as an instrument for stock market predictions. VICs can be a valuable addition, but our results indicate that this type of data is only helpful in certain market phases.
(This article belongs to the Section Data)