Algorithms, Volume 14, Issue 12 (December 2021) – 30 articles

Cover Story (view full-size image): MILTree (Multiple-Instance Learning Tree) is a multiscale tree-based visualization to support multiple-instance learning (MIL) problems. The first level of the tree represents the bags, and the second level represents the instances belonging to each bag, allowing users to understand the MIL datasets in an intuitive way. Two new instance selection methods are introduced as well. View this paper.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the “PDF Full-text” link, and use the free Adobe Reader to open them.
21 pages, 2222 KiB  
Article
Adaptive and Lightweight Abnormal Node Detection via Biological Immune Game in Mobile Multimedia Networks
by Yajing Zhang, Kai Wang and Jinghui Zhang
Algorithms 2021, 14(12), 368; https://doi.org/10.3390/a14120368 - 20 Dec 2021
Cited by 2 | Viewed by 2431
Abstract
Considering the contradiction between limited node resources and high detection costs in mobile multimedia networks, an adaptive and lightweight abnormal node detection algorithm based on artificial immunity and game theory is proposed in order to balance the trade-off between network security and detection overhead. The algorithm can adapt to the highly dynamic mobile multimedia networking environment with a large number of heterogeneous nodes and multi-source big data. Specifically, the heterogeneity problem of nodes is solved based on the non-specificity of an immune algorithm. A niche strategy is used to identify dangerous areas, and antibody division generates an antibody library that can be updated online, so as to realize the dynamic detection of the abnormal behavior of nodes. Moreover, the priority of node recovery for abnormal nodes is decided through a game between nodes without causing excessive resource consumption for security detection. The results of comparative experiments show that the proposed algorithm has a relatively high detection rate and a low false-positive rate, can effectively reduce consumption time, and has a good level of adaptability under the condition of dynamic nodes. Full article
(This article belongs to the Special Issue Network Science: Algorithms and Applications)

15 pages, 362 KiB  
Article
A Pathfinding Problem for Fork-Join Directed Acyclic Graphs with Unknown Edge Length
by Kunihiko Hiraishi
Algorithms 2021, 14(12), 367; https://doi.org/10.3390/a14120367 - 17 Dec 2021
Viewed by 2258
Abstract
In a previous paper by the author, a pathfinding problem for directed trees is studied under the following situation: each edge has a nonnegative integer length, but the length is unknown in advance and should be found by a procedure whose computational cost becomes exponentially larger as the length increases. In this paper, the same problem is studied for a more general class of graphs called fork-join directed acyclic graphs. The problem for the new class of graphs contains the previous one. In addition, the optimality criterion used in this paper is stronger than that in the previous paper and is more appropriate for real applications. Full article

29 pages, 2397 KiB  
Article
Subspace Detours Meet Gromov–Wasserstein
by Clément Bonet, Titouan Vayer, Nicolas Courty, François Septier and Lucas Drumetz
Algorithms 2021, 14(12), 366; https://doi.org/10.3390/a14120366 - 17 Dec 2021
Cited by 2 | Viewed by 3049
Abstract
In the context of optimal transport (OT) methods, the subspace detour approach was recently proposed by Muzellec and Cuturi. It consists of first finding an optimal plan between the measures projected on a wisely chosen subspace and then completing it in a nearly optimal transport plan on the whole space. The contribution of this paper is to extend this category of methods to the Gromov–Wasserstein problem, which is a particular type of OT distance involving the specific geometry of each distribution. After deriving the associated formalism and properties, we give an experimental illustration on a shape matching problem. We also discuss a specific cost for which we can show connections with the Knothe–Rosenblatt rearrangement. Full article
(This article belongs to the Special Issue Optimal Transport: Algorithms and Applications)

15 pages, 608 KiB  
Article
A Branch-and-Bound Algorithm for Polymatrix Games ϵ-Proper Nash Equilibria Computation
by Slim Belhaiza
Algorithms 2021, 14(12), 365; https://doi.org/10.3390/a14120365 - 16 Dec 2021
Viewed by 2290
Abstract
When several Nash equilibria exist in a game, decision-makers need to refine their choices based on some refinement concepts. To this aim, the notion of an ϵ-proper equilibria set for polymatrix games is used to develop 0–1 mixed linear programs and compute ϵ-proper Nash equilibria. A Branch-and-Bound exact arithmetic algorithm is proposed. Experimental results are provided on polymatrix games randomly generated with different sizes and densities. Full article
(This article belongs to the Special Issue Algorithmic Game Theory 2021)

33 pages, 5524 KiB  
Review
Metaheuristics in the Humanitarian Supply Chain
by Francisca Santana Robles, Eva Selene Hernández-Gress, Neil Hernández-Gress and Rafael Granillo Macias
Algorithms 2021, 14(12), 364; https://doi.org/10.3390/a14120364 - 15 Dec 2021
Cited by 4 | Viewed by 3704
Abstract
Every day, there are more disasters that require Humanitarian Supply Chain (HSC) attention; generally, these problems are difficult to solve in reasonable computational time, and metaheuristics (MHs) are the indicated solution algorithms. To our knowledge, there has not been a review article on MHs applied to HSC. In this work, 78 articles were extracted from 2016 publications using systematic literature review methodology and were analyzed to answer two research questions: (1) How are the HSC problems that have been solved with Metaheuristics classified? (2) What is the gap found to accomplish future research in Metaheuristics in HSC? After classifying them into deterministic (52.56%) and non-deterministic (47.44%) problems; post-disaster (51.28%), pre-disaster (14.10%) and integrated (34.62%); facility location (41.03%), distribution (71.79%), inventory (11.54%) and mass evacuation (10.26%); single (46.15%) and multiple objective functions (53.85%), single (76.92%) and multiple (23.07%) period; and the type of Metaheuristic: metaphor-based (71.79%), with genetic algorithms and particle swarm optimization as the most used, and non-metaphor-based (28.20%), in which search algorithms are mostly used; it is concluded that, to consider the uncertainty of the real context, future research should be done in non-deterministic and multi-period problems that integrate pre- and post-disaster stages, that increasingly include problems such as inventory and mass evacuation, and in which new multi-objective MHs are tested. Full article
(This article belongs to the Special Issue Metaheuristic Algorithms in Optimization and Applications 2021)

13 pages, 374 KiB  
Article
Resource Allocation for Intelligent Reflecting Surfaces Assisted Federated Learning System with Imperfect CSI
by Wei Huang, Zhiren Han, Li Zhao, Hongbo Xu, Zhongnian Li and Ze Wang
Algorithms 2021, 14(12), 363; https://doi.org/10.3390/a14120363 - 14 Dec 2021
Cited by 4 | Viewed by 2890
Abstract
Due to its ability to significantly improve the wireless communication efficiency, the intelligent reflective surface (IRS) has aroused widespread research interest. However, it is a challenge to obtain perfect channel state information (CSI) for IRS-related channels due to the lack of the ability to send, receive, and process signals at the IRS. Since most of the existing channel estimation methods are developed to obtain the cascaded base station (BS)-IRS-user devices (UDs) channel, this paper studies the problem of computation and communication resource allocation of the IRS-assisted federated learning (FL) system based on imperfect CSI. Specifically, we take the statistical CSI error model into consideration and formulate the training time minimization problem subject to the rate outage probability constraints. In order to solve this issue, the semi-definite relaxation (SDR) and the constrained concave convex procedure (CCCP) are invoked to transform it into a convex problem. Subsequently, a low-complexity algorithm is proposed to minimize the delay of the FL system. Numerical results show that the proposed algorithm effectively reduces the training time of the FL system based on imperfect CSI. Full article
(This article belongs to the Special Issue Algorithms for Communication Networks)

24 pages, 1168 KiB  
Article
Faster Provable Sieving Algorithms for the Shortest Vector Problem and the Closest Vector Problem on Lattices in ℓp Norm
by Priyanka Mukhopadhyay
Algorithms 2021, 14(12), 362; https://doi.org/10.3390/a14120362 - 13 Dec 2021
Cited by 3 | Viewed by 2920
Abstract
In this work, we give provable sieving algorithms for the Shortest Vector Problem (SVP) and the Closest Vector Problem (CVP) on lattices in the ℓp norm (1 ≤ p ≤ ∞). The running time we obtain is better than that of existing provable sieving algorithms. We give a new linear sieving procedure that works for all ℓp norms (1 ≤ p ≤ ∞). The main idea is to divide the space into hypercubes such that each vector can be mapped efficiently to a sub-region. We achieve a time complexity of 2^(2.751n+o(n)), which is much less than the 2^(3.849n+o(n)) complexity of the previous best algorithm. We also introduce a mixed sieving procedure, where a point is mapped to a hypercube within a ball and then a quadratic sieve is performed within each hypercube. This improves the running time, especially in the ℓ2 norm, where we achieve a time complexity of 2^(2.25n+o(n)), while the List Sieve Birthday algorithm has a running time of 2^(2.465n+o(n)). We adapt our sieving techniques to approximation algorithms for SVP and CVP in the ℓp norm (1 ≤ p ≤ ∞) and show that our algorithm has a running time of 2^(2.001n+o(n)), while previous algorithms have a time complexity of 2^(3.169n+o(n)). Full article
(This article belongs to the Section Randomized, Online, and Approximation Algorithms)
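To make the hypercube bucketing described in this abstract concrete, here is a toy Python sketch of a single bucketing-and-reduction pass; the cell side and the reduction rule are illustrative assumptions, not the sieving procedure analysed in the paper.

```python
import numpy as np

def linear_sieve_step(vectors, cell_side):
    """One illustrative bucketing pass: map each vector to a hypercube cell and
    reduce later arrivals against the cell's representative, so the surviving
    differences are shorter vectors.  Toy sketch only, not the paper's procedure."""
    buckets = {}
    reduced = []
    for v in vectors:
        cell = tuple(np.floor(v / cell_side).astype(int))   # hypercube containing v
        if cell not in buckets:
            buckets[cell] = v                                # first vector becomes the representative
        else:
            reduced.append(v - buckets[cell])                # two vectors in one cell -> short difference
    return list(buckets.values()), reduced

rng = np.random.default_rng(0)
lattice_like = [rng.integers(-50, 51, size=4).astype(float) for _ in range(200)]
reps, shorter = linear_sieve_step(lattice_like, cell_side=25.0)
print(len(reps), len(shorter))
```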

14 pages, 743 KiB  
Article
A Domain Adaptive Person Re-Identification Based on Dual Attention Mechanism and Camstyle Transfer
by Chengyan Zhong, Guanqiu Qi, Neal Mazur, Sarbani Banerjee, Devanshi Malaviya and Gang Hu
Algorithms 2021, 14(12), 361; https://doi.org/10.3390/a14120361 - 13 Dec 2021
Cited by 5 | Viewed by 2570
Abstract
Due to the variation in the image capturing process, the difference between source and target sets causes a challenge in unsupervised domain adaptation (UDA) on person re-identification (re-ID). Given a labeled source training set and an unlabeled target training set, this paper focuses on improving the generalization ability of the re-ID model on the target testing set. The proposed method enforces two properties at the same time: (1) camera invariance is achieved through the positive learning formed by unlabeled target images and their camera style transfer counterparts; and (2) the robustness of the backbone network feature extraction is improved, and the accuracy of feature extraction is enhanced by adding a position-channel dual attention mechanism. The proposed network model uses a classic dual-stream network. Comparative experimental results on three public benchmarks prove the superiority of the proposed method. Full article
(This article belongs to the Special Issue Deep Learning in Intelligent Video Surveillance)

15 pages, 897 KiB  
Article
Merging Discrete Morse Vector Fields: A Case of Stubborn Geometric Parallelization
by Douglas Lenseth and Boris Goldfarb
Algorithms 2021, 14(12), 360; https://doi.org/10.3390/a14120360 - 11 Dec 2021
Viewed by 2122
Abstract
We address the basic question in discrete Morse theory of combining discrete gradient fields that are partially defined on subsets of the given complex. This is a well-posed question when the discrete gradient field V is generated using a fixed algorithm which has a local nature. One example is ProcessLowerStars, a widely used algorithm for computing persistent homology associated to a grey-scale image in 2D or 3D. While the algorithm for V may be inherently local, being computed within stars of vertices and so embarrassingly parallelizable, in practical use, it is natural to want to distribute the computation over patches Pi, apply the chosen algorithm to compute the fields Vi associated to each patch, and then assemble the ambient field V from these. Simply merging the fields from the patches, even when that makes sense, gives a wrong answer. We develop both very general merging procedures and leaner versions designed for specific, easy-to-arrange covering patterns. Full article
(This article belongs to the Special Issue Distributed Algorithms and Applications)

11 pages, 326 KiB  
Article
Lempel-Ziv Parsing for Sequences of Blocks
by Dmitry Kosolobov and Daniel Valenzuela
Algorithms 2021, 14(12), 359; https://doi.org/10.3390/a14120359 - 10 Dec 2021
Viewed by 2638
Abstract
The Lempel-Ziv parsing (LZ77) is a widely popular construction lying at the heart of many compression algorithms. These algorithms usually treat the data as a sequence of bytes, i.e., blocks of fixed length 8. Another common option is to view the data as a sequence of bits. We investigate the following natural question: what is the relationship between the LZ77 parsings of the same data interpreted as a sequence of fixed-length blocks and as a sequence of bits (or other “elementary” letters)? In this paper, we prove that, for any integer b > 1, the number z of phrases in the LZ77 parsing of a string of length n and the number z_b of phrases in the LZ77 parsing of the same string in which blocks of length b are interpreted as separate letters (e.g., b = 8 in case of bytes) are related as z_b = O(b·z·log(n/z)). The bound holds for both “overlapping” and “non-overlapping” versions of LZ77. Further, we establish a tight bound z_b = O(b·z) for the special case when each phrase in the LZ77 parsing of the string has a “phrase-aligned” earlier occurrence (an occurrence equal to the concatenation of consecutive phrases). The latter is an important particular case of parsing produced, for instance, by grammar-based compression methods. Full article
(This article belongs to the Section Analysis of Algorithms and Complexity Theory)
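The relationship between z and z_b can be explored empirically with a small script. The sketch below uses one common greedy variant of LZ77 (the paper's exact phrase definition may differ) and counts phrases for the same data read as bytes and as bits.

```python
def lz77_phrase_count(seq):
    """Greedy LZ77-style phrase counter (one common variant): each phrase is the
    longest prefix of the remaining suffix that also starts earlier in seq,
    or a single fresh symbol when nothing matches."""
    n, i, phrases = len(seq), 0, 0
    while i < n:
        longest = 0
        for j in range(i):                                   # earlier candidate start positions
            length = 0
            while i + length < n and seq[j + length] == seq[i + length]:
                length += 1                                  # matches may overlap position i
            longest = max(longest, length)
        i += max(1, longest)                                 # consume the phrase (or one new symbol)
        phrases += 1
    return phrases

data = b"abracadabra" * 3
as_bytes = list(data)                                        # blocks of length b = 8 (bytes)
as_bits = "".join(f"{byte:08b}" for byte in data)            # the same data as single bits
print(lz77_phrase_count(as_bytes), lz77_phrase_count(as_bits))
```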

22 pages, 7472 KiB  
Article
Agent State Flipping Based Hybridization of Heuristic Optimization Algorithms: A Case of Bat Algorithm and Krill Herd Hybrid Algorithm
by Robertas Damaševičius and Rytis Maskeliūnas
Algorithms 2021, 14(12), 358; https://doi.org/10.3390/a14120358 - 10 Dec 2021
Cited by 11 | Viewed by 3028
Abstract
This paper describes a unique meta-heuristic technique for hybridizing bio-inspired heuristic algorithms. The technique is based on altering the state of agents using a logistic probability function that is dependent on an agent’s fitness rank. An evaluation using two bio-inspired algorithms (bat algorithm (BA) and krill herd (KH)) and 12 optimization problems (cross-in-tray, rotated hyper-ellipsoid (RHE), sphere, sum of squares, sum of different powers, McCormick, Zakharov, Rosenbrock, De Jong No. 5, Easom, Branin, and Styblinski–Tang) is presented. Furthermore, an experimental evaluation of the proposed scheme using the industrial three-bar truss design problem is presented. The experimental results demonstrate that the hybrid scheme outperformed the baseline algorithms (mean rank for the hybrid BA-KH algorithm is 1.279 vs. 1.958 for KH and 2.763 for BA). Full article
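The abstract does not spell out the flipping function, so the sketch below only illustrates the mechanism: each agent's chance of being switched to the other base heuristic is a logistic function of its normalised fitness rank. The logistic form and the steepness value are assumptions made for illustration.

```python
import numpy as np

def flip_probabilities(fitness, steepness=6.0):
    """Probability of flipping each agent to the other base heuristic as a logistic
    function of its normalised fitness rank (0 = best, 1 = worst, for minimisation).
    The exact function used in the paper is not given in the abstract; this form and
    the steepness value are assumptions."""
    ranks = np.argsort(np.argsort(fitness))            # rank 0 for the fittest agent
    r = ranks / max(len(fitness) - 1, 1)               # normalise ranks to [0, 1]
    return 1.0 / (1.0 + np.exp(-steepness * (r - 0.5)))

fitness = np.array([3.2, 0.7, 1.5, 9.0, 2.1])
p = flip_probabilities(fitness)
flips = np.random.default_rng(1).uniform(size=len(p)) < p   # agents handed to the other algorithm
print(np.round(p, 2), flips)
```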

3 pages, 166 KiB  
Editorial
Special Issue “2021 Selected Papers from Algorithms’ Editorial Board Members”
by Frank Werner
Algorithms 2021, 14(12), 357; https://doi.org/10.3390/a14120357 - 9 Dec 2021
Cited by 1 | Viewed by 2196
Abstract
This is the second edition of a special issue of Algorithms that is of a rather different nature compared to other Special Issues in the journal, which are usually dedicated to a particular subject in the area of algorithms [...] Full article
(This article belongs to the Special Issue 2021 Selected Papers from Algorithms Editorial Board Members)
17 pages, 1302 KiB  
Article
Optimized Weighted Nearest Neighbours Matching Algorithm for Control Group Selection
by Szabolcs Szekér and Ágnes Vathy-Fogarassy
Algorithms 2021, 14(12), 356; https://doi.org/10.3390/a14120356 - 8 Dec 2021
Viewed by 3792
Abstract
An essential criterion for the proper implementation of case-control studies is selecting appropriate case and control groups. In this article, a new simulated annealing-based control group selection method is proposed, which solves the problem of selecting individuals in the control group as a distance optimization task. The proposed algorithm pairs the individuals in the n-dimensional feature space by minimizing the weighted distances between them. The weights of the dimensions are based on the odds ratios calculated from the logistic regression model fitted on the variables describing the probability of membership of the treated group. For finding the optimal pairing of the individuals, simulated annealing is utilized. The effectiveness of the newly proposed Weighted Nearest Neighbours Control Group Selection with Simulated Annealing (WNNSA) algorithm is presented by two Monte Carlo studies. Results show that the WNNSA method can outperform the widely applied greedy propensity score matching method in feature spaces where only a few covariates characterize individuals and the covariates can only take a few values. Full article
(This article belongs to the Section Algorithms for Multidisciplinary Applications)
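For orientation, the weighted distance at the core of the method can be sketched as follows; the odds-ratio weights are assumed values, and a greedy pass stands in for the simulated-annealing search that WNNSA actually performs.

```python
import numpy as np

def greedy_weighted_matching(cases, controls, odds_ratios):
    """Pair every case with its nearest still-unused control under a weighted
    Euclidean distance whose weights come from (assumed) odds ratios.  A greedy
    stand-in for WNNSA's simulated-annealing search over pairings."""
    w = np.asarray(odds_ratios, dtype=float)
    used, pairs = set(), []
    for ci, case in enumerate(cases):
        dists = np.sqrt((((controls - case) ** 2) * w).sum(axis=1))
        for idx in np.argsort(dists):                  # nearest unused control wins
            if int(idx) not in used:
                used.add(int(idx))
                pairs.append((ci, int(idx)))
                break
    return pairs

rng = np.random.default_rng(0)
cases = rng.normal(size=(5, 3))                        # treated individuals, 3 covariates
controls = rng.normal(size=(20, 3))                    # candidate controls
print(greedy_weighted_matching(cases, controls, odds_ratios=[1.8, 1.1, 2.5]))
```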

10 pages, 265 KiB  
Article
A Meeting Point of Probability, Graphs, and Algorithms: The Lovász Local Lemma and Related Results—A Survey
by András Faragó
Algorithms 2021, 14(12), 355; https://doi.org/10.3390/a14120355 - 8 Dec 2021
Cited by 2 | Viewed by 3443
Abstract
A classic and fundamental result, known as the Lovász Local Lemma, is a gem in the probabilistic method of combinatorics. At a high level, its core message can be described by the claim that weakly dependent events behave similarly to independent ones. A fascinating feature of this result is that even though it is a purely probabilistic statement, it provides a valuable and versatile tool for proving completely deterministic theorems. The Lovász Local Lemma has found many applications; despite being originally published in 1973, it still attracts active novel research. In this survey paper, we review various forms of the Lemma, as well as some related results and applications. Full article
(This article belongs to the Special Issue Surveys in Algorithm Analysis and Complexity Theory)
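For reference, the most commonly quoted symmetric form of the Lemma (a standard statement, not taken from the survey itself) is:

```latex
% Symmetric form of the Lovász Local Lemma (standard statement, added for reference)
\textbf{Lemma (symmetric LLL).} Let $A_1,\dots,A_n$ be events with $\Pr[A_i]\le p$ for all $i$,
and suppose each $A_i$ is mutually independent of all but at most $d$ of the other events. If
\[
  e \, p \, (d+1) \le 1 ,
\]
then $\Pr\Bigl[\,\bigcap_{i=1}^{n}\overline{A_i}\,\Bigr] > 0$, i.e., with positive probability none of the events occurs.
```
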
19 pages, 1318 KiB  
Article
A Model-Driven Approach for Solving the Software Component Allocation Problem
by Issam Al-Azzoni, Julian Blank and Nenad Petrović
Algorithms 2021, 14(12), 354; https://doi.org/10.3390/a14120354 - 6 Dec 2021
Cited by 4 | Viewed by 3100
Abstract
The underlying infrastructure paradigms behind the novel usage scenarios and services are becoming increasingly complex—from everyday life in smart cities to industrial environments. Both the number of devices involved and their heterogeneity make the allocation of software components quite challenging. Despite the enormous flexibility enabled by component-based software engineering, finding the optimal allocation of software artifacts to the pool of available devices and computation units could bring many benefits, such as improved quality of service (QoS), reduced energy consumption, reduction of costs, and many others. Therefore, in this paper, we introduce a model-based framework that aims to solve the software component allocation problem (CAP). We formulate it as an optimization problem with either single or multiple objective functions and cover both cases in the proposed framework. Additionally, our framework also provides visualization and comparison of the optimal solutions in the case of multi-objective component allocation. The main contributions introduced in this paper are: (1) a novel methodology for tackling CAP-like problems based on the usage of model-driven engineering (MDE) for both problem definition and solution representation; (2) a set of Python tools that enable the workflow, starting from CAP model interpretation, followed by the generation of optimal allocations and, finally, result visualization. The proposed framework is compared to other similar works using linear optimization, a genetic algorithm (GA), or an ant colony optimization (ACO) algorithm within the experiments based on notable papers on this topic, covering various usage scenarios—from Cloud and Fog computing infrastructure management to embedded systems, robotics, and telecommunications. According to the achieved results, our framework performs much faster than the GA- and ACO-based solutions. Apart from the various benefits of adopting a multi-objective approach in many cases, it also shows a significant speedup compared to frameworks leveraging single-objective linear optimization, especially in the case of larger problem models. Full article

23 pages, 4431 KiB  
Article
Hexadecimal Aggregate Approximation Representation and Classification of Time Series Data
by Zhenwen He, Chunfeng Zhang, Xiaogang Ma and Gang Liu
Algorithms 2021, 14(12), 353; https://doi.org/10.3390/a14120353 - 2 Dec 2021
Cited by 3 | Viewed by 3107
Abstract
Time series data are widely found in finance, health, environmental, social, mobile and other fields. A large amount of time series data has been produced due to the general use of smartphones, various sensors, RFID and other internet devices. How a time series is represented is key to the efficient and effective storage and management of time series data, as well as being very important to time series classification. Two new time series representation methods, Hexadecimal Aggregate approXimation (HAX) and Point Aggregate approXimation (PAX), are proposed in this paper. The two methods represent each segment of a time series as a transformable interval object (TIO). Then, each TIO is mapped to a spatial point located on a two-dimensional plane. Finally, the HAX maps each point to a hexadecimal digit so that a time series is converted into a hex string. The experimental results show that HAX has higher classification accuracy than Symbolic Aggregate approXimation (SAX) but lower accuracy than some SAX variants (SAX-TD, SAX-BD). HAX has the same space cost as SAX, which is lower than that of these variants. PAX has higher classification accuracy than HAX and is extremely close to the Euclidean distance (ED) measurement; however, the space cost of PAX is generally much lower than the space cost of ED. HAX and PAX are general representation methods that can also support geoscience time series clustering, indexing and querying in addition to classification. Full article
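A toy version of the segment-to-hex-digit pipeline is sketched below; the (mean, slope) coordinates and the 4 x 4 quantisation grid are assumptions made for illustration, as the paper's transformable interval objects are constructed differently.

```python
import numpy as np

def hex_approximation(series, n_segments=8):
    """Toy segment -> 2-D point -> hex digit pipeline in the spirit of HAX.
    Using the (mean, slope) of each segment and a 4x4 quantisation grid is an
    assumption for illustration; the paper's TIO construction differs in detail."""
    x = (series - series.mean()) / (series.std() + 1e-12)       # z-normalise the series
    digits = []
    for seg in np.array_split(x, n_segments):
        mean = seg.mean()
        slope = np.polyfit(np.arange(len(seg)), seg, 1)[0]
        row = min(3, max(0, int((mean + 2.0) / 4.0 * 4)))        # quantise the mean into 4 bins
        col = min(3, max(0, int((slope + 1.0) / 2.0 * 4)))       # quantise the slope into 4 bins
        digits.append(format(row * 4 + col, "x"))                # one hex digit per segment
    return "".join(digits)

print(hex_approximation(np.sin(np.linspace(0, 4 * np.pi, 128))))
```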

14 pages, 3068 KiB  
Article
A Sequential Graph Neural Network for Short Text Classification
by Ke Zhao, Lan Huang, Rui Song, Qiang Shen and Hao Xu
Algorithms 2021, 14(12), 352; https://doi.org/10.3390/a14120352 - 1 Dec 2021
Cited by 10 | Viewed by 4406
Abstract
Short text classification is an important problem of natural language processing (NLP), and graph neural networks (GNNs) have been successfully used to solve different NLP problems. However, few studies employ GNN for short text classification, and most of the existing graph-based models ignore sequential information (e.g., word orders) in each document. In this work, we propose an improved sequence-based feature propagation scheme, which fully uses word representation and document-level word interaction and overcomes the limitations of textual features in short texts. On this basis, we utilize this propagation scheme to construct a lightweight model, sequential GNN (SGNN), and its extended model, ESGNN. Specifically, we build individual graphs for each document in the short text corpus based on word co-occurrence and use a bidirectional long short-term memory network (Bi-LSTM) to extract the sequential features of each document; therefore, word nodes in the document graph retain contextual information. Furthermore, two different simplified graph convolutional networks (GCNs) are used to learn word representations based on their local structures. Finally, word nodes combined with sequential information and local information are incorporated as the document representation. Extensive experiments on seven benchmark datasets demonstrate the effectiveness of our method. Full article

20 pages, 1250 KiB  
Article
Locally Scaled and Stochastic Volatility Metropolis–Hastings Algorithms
by Wilson Tsakane Mongwe, Rendani Mbuvha and Tshilidzi Marwala
Algorithms 2021, 14(12), 351; https://doi.org/10.3390/a14120351 - 30 Nov 2021
Cited by 4 | Viewed by 2835
Abstract
Markov chain Monte Carlo (MCMC) techniques are usually used to infer model parameters when closed-form inference is not feasible, with one of the simplest MCMC methods being the random walk Metropolis–Hastings (MH) algorithm. The MH algorithm suffers from random walk behaviour, which results in inefficient exploration of the target posterior distribution. This method has been improved upon, with algorithms such as Metropolis Adjusted Langevin Monte Carlo (MALA) and Hamiltonian Monte Carlo being examples of popular modifications to MH. In this work, we revisit the MH algorithm to reduce the autocorrelations in the generated samples without adding significant computational time. We present: (1) the Stochastic Volatility Metropolis–Hastings (SVMH) algorithm, which is based on using a random scaling matrix in the MH algorithm, and (2) the Locally Scaled Metropolis–Hastings (LSMH) algorithm, in which the scaling matrix depends on the local geometry of the target distribution. For both these algorithms, the proposal distribution is still Gaussian centred at the current state. The empirical results show that these minor additions to the MH algorithm significantly improve the effective sample rates and predictive performance over the vanilla MH method. The SVMH algorithm produces similar effective sample sizes to the LSMH method, with SVMH outperforming LSMH on an execution-time-normalised effective sample size basis. The performance of the proposed methods is also compared to MALA and the current state-of-the-art method, the No-U-Turn sampler (NUTS). The analysis is performed using a simulation study based on Neal’s funnel and multivariate Gaussian distributions and using real-world data modeled using jump diffusion processes and Bayesian logistic regression. Although both MALA and NUTS outperform the proposed algorithms on an effective sample size basis, the SVMH algorithm has similar or better predictive performance when compared to MALA and NUTS across the various targets. In addition, the SVMH algorithm outperforms the other MCMC algorithms on a normalised effective sample size basis on the jump diffusion processes datasets. These results indicate the overall usefulness of the proposed algorithms. Full article
(This article belongs to the Special Issue Monte Carlo Methods and Algorithms)
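A minimal sketch of the stochastic-scaling idea follows, assuming a scalar step-size multiplier drawn independently of the current state (which keeps the proposal symmetric, so the plain MH acceptance ratio applies); the paper's scaling matrices are more general than this.

```python
import numpy as np

def svmh_sample(log_target, x0, n_steps=5000, base_scale=0.5, rng=None):
    """Random-walk Metropolis-Hastings whose Gaussian proposal is rescaled by a random
    factor at every step.  Because the factor is drawn independently of the current
    state, the proposal stays symmetric and the plain MH acceptance ratio applies.
    Sketch of the idea only; the paper's scaling-matrix construction is richer."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    logp = log_target(x)
    samples = []
    for _ in range(n_steps):
        scale = base_scale * np.exp(rng.normal(0.0, 0.5))    # random step-size multiplier
        proposal = x + scale * rng.normal(size=x.shape)      # still centred at the current state
        logp_prop = log_target(proposal)
        if np.log(rng.uniform()) < logp_prop - logp:         # standard accept/reject
            x, logp = proposal, logp_prop
        samples.append(x.copy())
    return np.array(samples)

chain = svmh_sample(lambda x: -0.5 * np.sum(x ** 2), x0=np.zeros(2))   # toy 2-D Gaussian target
print(chain.mean(axis=0), chain.std(axis=0))
```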

16 pages, 1096 KiB  
Article
The Buy-Online-Pick-Up-in-Store Retailing Model: Optimization Strategies for In-Store Picking and Packing
by Nicola Ognibene Pietri, Xiaochen Chou, Dominic Loske, Matthias Klumpp and Roberto Montemanni
Algorithms 2021, 14(12), 350; https://doi.org/10.3390/a14120350 - 30 Nov 2021
Cited by 11 | Viewed by 4370
Abstract
Online shopping is growing fast due to the increasingly widespread use of digital services. During the COVID-19 pandemic, the desire for contactless shopping has further changed consumer behavior and accelerated the acceptance of online grocery purchases. Consequently, traditional brick-and-mortar retailers are developing omnichannel solutions such as click-and-collect services to fulfill the increasing demand. In this work, we consider the Buy-Online-Pick-up-in-Store concept, in which online orders are collected by employees of the conventional stores. As labor is a major cost driver, we apply and discuss different optimizing strategies in the picking and packing process based on real-world data from a German retailer. With comparison of different methods, we estimate the improvements in efficiency in terms of time spent during the picking process. Additionally, the time spent on the packing process can be further decreased by applying a mathematical model that guides the employees on how to organize the articles in different shopping bags during the picking process. In general, we put forward effective strategies for the Buy-Online-Pick-up-in-Store paradigm that can be easily implemented by stores with different topologies. Full article
(This article belongs to the Special Issue Mathematical Models and Their Applications III)

15 pages, 5466 KiB  
Article
GW-DC: A Deep Clustering Model Leveraging Two-Dimensional Image Transformation and Enhancement
by Xutong Li, Taoying Li and Yan Wang
Algorithms 2021, 14(12), 349; https://doi.org/10.3390/a14120349 - 29 Nov 2021
Cited by 2 | Viewed by 2994
Abstract
Traditional time-series clustering methods usually perform poorly on high-dimensional data. However, image clustering using deep learning methods can complete image annotation and searches in large image databases well. Therefore, this study aimed to propose a deep clustering model named GW_DC to convert one-dimensional time series into two-dimensional images and improve clustering performance for algorithm users. The proposed GW_DC consisted of three processing stages: the image conversion stage, image enhancement stage, and image clustering stage. In the image conversion stage, the time series were converted into four kinds of two-dimensional images by different algorithms, including grayscale images, recurrence plot images, Markov transition field images, and Gramian Angular Difference Field images; the last of these was considered the best by comparison. In the image enhancement stage, the signal components of the two-dimensional images were extracted and processed by wavelet transform to denoise and enhance texture features. Meanwhile, a deep clustering network, combining convolutional neural networks with K-Means, was designed to learn characteristics well and perform clustering on the aforementioned enhanced images. Finally, six UCR datasets were adopted to assess the performance of the models. The results showed that the proposed GW_DC model provided better results. Full article
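The Gramian Angular Difference Field mentioned in the abstract is a standard transform; a minimal NumPy version of the transform alone (not the full GW_DC pipeline) looks like this:

```python
import numpy as np

def gramian_angular_difference_field(series):
    """Standard GADF transform used to turn a 1-D series into a 2-D image:
    rescale to [-1, 1], encode values as angles, take sin of pairwise differences."""
    x = np.asarray(series, dtype=float)
    x = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0      # rescale to [-1, 1]
    phi = np.arccos(np.clip(x, -1.0, 1.0))                   # polar (angular) encoding
    return np.sin(phi[:, None] - phi[None, :])               # GADF[i, j] = sin(phi_i - phi_j)

image = gramian_angular_difference_field(np.sin(np.linspace(0, 6 * np.pi, 64)))
print(image.shape)    # (64, 64) image ready for a convolutional clustering network
```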

21 pages, 3667 KiB  
Article
Robust Representation and Efficient Feature Selection Allows for Effective Clustering of SARS-CoV-2 Variants
by Zahra Tayebi, Sarwan Ali and Murray Patterson
Algorithms 2021, 14(12), 348; https://doi.org/10.3390/a14120348 - 29 Nov 2021
Cited by 18 | Viewed by 3074
Abstract
The widespread availability of large amounts of genomic data on the SARS-CoV-2 virus, as a result of the COVID-19 pandemic, has created an opportunity for researchers to analyze the disease at a level of detail unlike that of any virus before it. On the one hand, this will help biologists, policymakers, and other authorities to make timely and appropriate decisions to control the spread of the coronavirus. On the other hand, such studies will help to more effectively deal with any possible future pandemic. Since the SARS-CoV-2 virus contains different variants, each of them having different mutations, performing any analysis on such data becomes a difficult task, given the size of the data. It is well known that much of the variation in the SARS-CoV-2 genome happens disproportionately in the spike region of the genome sequence—the relatively short region which codes for the spike protein(s). In this paper, we propose a robust feature-vector representation of biological sequences that, when combined with the appropriate feature selection method, allows different downstream clustering approaches to perform well on a variety of different measures. We use the proposed approach with an array of clustering techniques to cluster spike protein sequences in order to study the behavior of different known variants that are increasing at a very high rate throughout the world. We first use a k-mers-based approach to generate a fixed-length feature vector representation of the spike sequences. We then show that we can efficiently and effectively cluster the spike sequences based on the different variants with the appropriate feature selection. Using a publicly available set of SARS-CoV-2 spike sequences, we perform clustering of these sequences using both hard and soft clustering methods and show that, with our feature selection methods, we can achieve higher F1 scores for the clusters and also better clustering quality metrics compared to baselines. Full article
(This article belongs to the Special Issue Explainable Artificial Intelligence in Bioinformatic)
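The fixed-length k-mer representation described in the abstract can be sketched as follows; k = 3 and the plain count encoding are illustrative choices, and the paper additionally applies feature selection on top of such vectors.

```python
from itertools import product
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def kmer_feature_vector(sequence, k=3):
    """Fixed-length k-mer count vector for a protein sequence (20^k entries).
    k = 3 and plain counts are illustrative choices, not the paper's exact setup."""
    index = {"".join(km): i for i, km in enumerate(product(AMINO_ACIDS, repeat=k))}
    vec = np.zeros(len(index))
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if kmer in index:                      # skip k-mers with ambiguous residues (e.g., 'X')
            vec[index[kmer]] += 1
    return vec

spike_fragment = "MFVFLVLLPLVSSQCVNLT"              # short spike-like fragment for illustration
print(kmer_feature_vector(spike_fragment).shape)    # (8000,) for k = 3
```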

20 pages, 516 KiB  
Article
Computing the Atom Graph of a Graph and the Union Join Graph of a Hypergraph
by Anne Berry and Geneviève Simonet
Algorithms 2021, 14(12), 347; https://doi.org/10.3390/a14120347 - 28 Nov 2021
Cited by 1 | Viewed by 2363
Abstract
The atom graph of a graph is a graph whose vertices are the atoms obtained by clique minimal separator decomposition of this graph, and whose edges are the edges of all possible atom trees of this graph. We provide two efficient algorithms for computing this atom graph, with a complexity in O(min(n^ω log n, nm, n(n + m̄))) time, where n is the number of vertices of G, m is the number of its edges, m̄ is the number of edges of the complement of G, and ω is a real number, also denoted by α in the literature, such that O(n^ω) is the best known time complexity for matrix multiplication, whose current value is 2.3728596. This time complexity is no more than the time complexity of computing the atoms in the general case. We extend our results to α-acyclic hypergraphs, which are hypergraphs having at least one join tree, a join tree of a hypergraph being defined by its hyperedges in the same way as an atom tree of a graph is defined by its atoms. We introduce the notion of the union join graph, which is the union of all possible join trees; we apply our algorithms for atom graphs to efficiently compute union join graphs. Full article
(This article belongs to the Special Issue Optimization Algorithms for Graphs and Complex Networks)

15 pages, 342 KiB  
Article
A Procedure for Factoring and Solving Nonlocal Boundary Value Problems for a Type of Linear Integro-Differential Equations
by Efthimios Providas and Ioannis Nestorios Parasidis
Algorithms 2021, 14(12), 346; https://doi.org/10.3390/a14120346 - 28 Nov 2021
Cited by 3 | Viewed by 2588
Abstract
The aim of this article is to present a procedure for the factorization and exact solution of boundary value problems for a class of n-th order linear Fredholm integro-differential equations with multipoint and integral boundary conditions. We use the theory of the extensions of linear operators in Banach spaces and establish conditions for the decomposition of the integro-differential operator into two lower-order integro-differential operators. We also create solvability criteria and derive the unique solution in closed form. Two example problems, for an ordinary and a partial integro-differential equation, respectively, are solved. Full article
18 pages, 16996 KiB  
Article
Compensating Data Shortages in Manufacturing with Monotonicity Knowledge
by Martin von Kurnatowski, Jochen Schmid, Patrick Link, Rebekka Zache, Lukas Morand, Torsten Kraft, Ingo Schmidt, Jan Schwientek and Anke Stoll
Algorithms 2021, 14(12), 345; https://doi.org/10.3390/a14120345 - 27 Nov 2021
Cited by 7 | Viewed by 2992
Abstract
Systematic decision making in engineering requires appropriate models. In this article, we introduce a regression method for enhancing the predictive power of a model by exploiting expert knowledge in the form of shape constraints, or more specifically, monotonicity constraints. Incorporating such information is particularly useful when the available datasets are small or do not cover the entire input space, as is often the case in manufacturing applications. We set up the regression subject to the considered monotonicity constraints as a semi-infinite optimization problem, and propose an adaptive solution algorithm. The method is applicable in multiple dimensions and can be extended to more general shape constraints. It was tested and validated on two real-world manufacturing processes, namely, laser glass bending and press hardening of sheet metal. It was found that the resulting models both complied well with the expert’s monotonicity knowledge and predicted the training data accurately. The suggested approach led to lower root-mean-squared errors than comparative methods from the literature for the sparse datasets considered in this work. Full article
(This article belongs to the Special Issue Optimization Algorithms and Applications at OLA 2021)

28 pages, 2819 KiB  
Article
A Visual Mining Approach to Improved Multiple-Instance Learning
by Sonia Castelo, Moacir Ponti and Rosane Minghim
Algorithms 2021, 14(12), 344; https://doi.org/10.3390/a14120344 - 27 Nov 2021
Viewed by 2583
Abstract
Multiple-instance learning (MIL) is a paradigm of machine learning that aims to classify a set (bag) of objects (instances), assigning labels only to the bags. This problem is often addressed by selecting an instance to represent each bag, transforming an MIL problem into standard supervised learning. Visualization can be a useful tool to assess learning scenarios by incorporating the users’ knowledge into the classification process. Considering that multiple-instance learning is a paradigm that cannot be handled by current visualization techniques, we propose a multiscale tree-based visualization called MILTree to support MIL problems. The first level of the tree represents the bags, and the second level represents the instances belonging to each bag, allowing users to understand the MIL datasets in an intuitive way. In addition, we propose two new instance selection methods for MIL, which help users improve the model even further. Our methods can handle both binary and multiclass scenarios. In our experiments, SVM was used to build the classifiers. With support of the MILTree layout, the initial classification model was updated by changing the training set, which is composed of the prototype instances. Experimental results validate the effectiveness of our approach, showing that visual mining by MILTree can support exploring and improving models in MIL scenarios and that our instance selection methods outperform the currently available alternatives in most cases. Full article
(This article belongs to the Special Issue New Algorithms for Visual Data Mining)

13 pages, 18548 KiB  
Article
Algorithmic Design of an FPGA-Based Calculator for Fast Evaluation of Tsunami Wave Danger
by Mikhail Lavrentiev, Konstantin Lysakov, Andrey Marchuk, Konstantin Oblaukhov and Mikhail Shadrin
Algorithms 2021, 14(12), 343; https://doi.org/10.3390/a14120343 - 26 Nov 2021
Cited by 3 | Viewed by 2767
Abstract
Events of a seismic nature followed by catastrophic floods caused by tsunami waves (the incidence of which has increased in recent decades) have an important impact on the populations of littoral regions. On the coast of Japan and Kamchatka, it takes nearly 20 min for tsunami waves to approach the nearest dry land after an offshore seismic event. This paper addresses an important question of fast simulation of tsunami wave propagation by mapping the algorithms in use in field-programmable gate arrays (FPGAs) with the help of high-level synthesis (HLS). Wave propagation is described by the shallow water system, and for numerical treatment the MacCormack scheme is used. The MacCormack algorithm is a direct difference scheme at a three-point stencil of a “cross” type; it happens to be appropriate for FPGA-based parallel implementation. A specialized calculator was designed. The developed software was tested for precision and performance. Numerical tests computing wave fronts show very good agreement with the available exact solutions (for two particular cases of the sea bed topography) and with the reference code. As the result, it takes just 17.06 s to simulate 1600 s (3200 time steps) of the wave propagation using a 3000 × 3200 computation grid with a VC709 board. The step length of the computational grid was chosen to display the simulation results in sufficient detail along the coastline. At the same time, the size of data arrays should provide their free placement in the memory of FPGA chips. The rather high performance achieved shows that tsunami danger could be correctly evaluated in a few minutes after seismic events. Full article
(This article belongs to the Special Issue Algorithms in Reconfigurable Computing)
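For readers unfamiliar with the scheme, here is a plain NumPy sketch of one MacCormack predictor-corrector step for the 1-D shallow water equations with a flat bottom and crude boundaries; it illustrates the numerical method named in the abstract, not the authors' 2-D FPGA design.

```python
import numpy as np

def maccormack_step(h, hu, dx, dt, g=9.81):
    """One MacCormack predictor-corrector step for the 1-D shallow water equations
    (flat bottom, crude copy boundaries).  Illustration of the scheme only."""
    def flux(h, hu):
        u = hu / h
        return np.array([hu, hu * u + 0.5 * g * h * h])

    U = np.array([h, hu])
    F = flux(h, hu)
    Up = U.copy()
    Up[:, :-1] = U[:, :-1] - dt / dx * (F[:, 1:] - F[:, :-1])      # predictor: forward differences
    Fp = flux(Up[0], Up[1])
    Un = U.copy()
    Un[:, 1:] = 0.5 * (U[:, 1:] + Up[:, 1:]
                       - dt / dx * (Fp[:, 1:] - Fp[:, :-1]))       # corrector: backward differences
    return Un[0], Un[1]

x = np.linspace(0.0, 1.0, 200)
h = 1.0 + 0.4 * np.exp(-200.0 * (x - 0.5) ** 2)    # small water hump on a flat bottom
hu = np.zeros_like(x)
for _ in range(100):
    h, hu = maccormack_step(h, hu, dx=x[1] - x[0], dt=0.001)
print(round(h.min(), 3), round(h.max(), 3))
```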

21 pages, 811 KiB  
Article
An O(log₂ N) Fully-Balanced Resampling Algorithm for Particle Filters on Distributed Memory Architectures
by Alessandro Varsi, Simon Maskell and Paul G. Spirakis
Algorithms 2021, 14(12), 342; https://doi.org/10.3390/a14120342 - 26 Nov 2021
Cited by 8 | Viewed by 4077
Abstract
Resampling is a well-known statistical algorithm that is commonly applied in the context of Particle Filters (PFs) in order to perform state estimation for non-linear non-Gaussian dynamic models. As the models become more complex and accurate, the run-time of PF applications becomes increasingly slow. Parallel computing can help to address this. However, resampling (and, hence, PFs as well) necessarily involves a bottleneck, the redistribution step, which is notoriously challenging to parallelize if using textbook parallel computing techniques. A state-of-the-art redistribution takes O((log₂ N)²) computations on Distributed Memory (DM) architectures, which most supercomputers adopt, whereas redistribution can be performed in O(log₂ N) on Shared Memory (SM) architectures, such as GPU or mainstream CPUs. In this paper, we propose a novel parallel redistribution for DM that achieves an O(log₂ N) time complexity. We also present empirical results that indicate that our novel approach outperforms the O((log₂ N)²) approach. Full article
(This article belongs to the Collection Parallel and Distributed Computing: Algorithms and Applications)
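For context, the sequential step whose redistribution the paper parallelises can be written as textbook systematic resampling; the code below is a single-process reference, not the distributed O(log₂ N) algorithm.

```python
import numpy as np

def systematic_resampling(weights, rng=None):
    """Textbook systematic resampling: returns how many copies of each particle
    survive.  Single-process reference for the step whose redistribution the
    paper parallelises on distributed memory."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(weights)
    positions = (rng.uniform() + np.arange(n)) / n           # n evenly spaced pointers
    cumulative = np.cumsum(weights / np.sum(weights))
    counts = np.zeros(n, dtype=int)
    i = j = 0
    while i < n:
        if positions[i] < cumulative[j]:
            counts[j] += 1                                   # particle j keeps one more copy
            i += 1
        else:
            j += 1
    return counts

counts = systematic_resampling(np.array([0.1, 0.4, 0.3, 0.2]))
print(counts, counts.sum())                                  # copies per particle; sum is always n
```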

16 pages, 3090 KiB  
Article
A Blockchain-Based Audit Trail Mechanism: Design and Implementation
by Cristina Regueiro, Iñaki Seco, Iván Gutiérrez-Agüero, Borja Urquizu and Jason Mansell
Algorithms 2021, 14(12), 341; https://doi.org/10.3390/a14120341 - 26 Nov 2021
Cited by 11 | Viewed by 6990
Abstract
Audit logs are a critical component in today’s enterprise business systems as they provide several benefits such as records transparency and integrity and security of sensitive information by creating a layer of evidential support. However, current implementations are vulnerable to attacks on data integrity or availability. This paper presents a Blockchain-based audit trail mechanism that leverages the security features of Blockchain to enable secure and reliable audit trails and to address the aforementioned vulnerabilities. The architecture design and specific implementation are described in detail, resulting in a real prototype of a reliable, secure, and user-friendly audit trail mechanism. Full article
(This article belongs to the Special Issue Advances in Blockchain Architecture and Consensus)

21 pages, 9131 KiB  
Review
Overview of Algorithms for Using Particle Morphology in Pre-Detonation Nuclear Forensics
by Tom Burr, Ian Schwerdt, Kari Sentz, Luther McDonald and Marianne Wilkerson
Algorithms 2021, 14(12), 340; https://doi.org/10.3390/a14120340 - 24 Nov 2021
Cited by 5 | Viewed by 3130
Abstract
A major goal in pre-detonation nuclear forensics is to infer the processing conditions and/or facility type that produced radiological material. This review paper focuses on analyses of particle size, shape, texture (“morphology”) signatures that could provide information on the provenance of interdicted materials. For example, uranium ore concentrates (UOC or yellowcake) include ammonium diuranate (ADU), ammonium uranyl carbonate (AUC), sodium diuranate (SDU), magnesium diuranate (MDU), and others, each prepared using different salts to precipitate U from solution. Once precipitated, UOCs are often dried and calcined to remove adsorbed water. The products can be allowed to react further, forming uranium oxides UO3, U3O8, or UO2 powders, whose surface morphology can be indicative of precipitation and/or calcination conditions used in their production. This review paper describes statistical issues and approaches in using quantitative analyses of measurements such as particle size and shape to infer production conditions. Statistical topics include multivariate t tests (Hotelling’s T2), design of experiments, and several machine learning (ML) options including decision trees, learning vector quantization neural networks, mixture discriminant analysis, and approximate Bayesian computation (ABC). ABC is emphasized as an attractive option to include the effects of model uncertainty in the selected and fitted forward model used for inferring processing conditions. Full article
(This article belongs to the Special Issue 2021 Selected Papers from Algorithms Editorial Board Members)

22 pages, 459 KiB  
Article
A Rule Extraction Technique Applied to Ensembles of Neural Networks, Random Forests, and Gradient-Boosted Trees
by Guido Bologna
Algorithms 2021, 14(12), 339; https://doi.org/10.3390/a14120339 - 23 Nov 2021
Cited by 12 | Viewed by 3234
Abstract
In machine learning, ensembles of models based on Multi-Layer Perceptrons (MLPs) or decision trees are considered successful models. However, explaining their responses is a complex problem that requires the creation of new methods of interpretation. A natural way to explain the classifications of the models is to transform them into propositional rules. In this work, we focus on random forests and gradient-boosted trees. Specifically, these models are converted into an ensemble of interpretable MLPs from which propositional rules are produced. The rule extraction method presented here allows one to precisely locate the discriminating hyperplanes that constitute the antecedents of the rules. In experiments based on eight classification problems, we compared our rule extraction technique to “Skope-Rules” and other state-of-the-art techniques. Experiments were performed with ten-fold cross-validation trials, with propositional rules that were also generated from ensembles of interpretable MLPs. By evaluating the characteristics of the extracted rules in terms of complexity, fidelity, and accuracy, the results obtained showed that our rule extraction technique is competitive. To the best of our knowledge, this is one of the few works showing a rule extraction technique that has been applied to both ensembles of decision trees and neural networks. Full article
(This article belongs to the Special Issue Ensemble Algorithms and/or Explainability)
