Uncertainty Quantification Techniques in Statistics

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Computational and Applied Mathematics".

Deadline for manuscript submissions: closed (31 December 2020) | Viewed by 21369

Printed Edition Available!
A printed edition of this Special Issue is available.

Special Issue Editor


Guest Editor
Statistics Discipline, Division of Science and Mathematics, University of Minnesota at Morris, Morris, MN 56267, USA
Interests: probability and stochastic processes; functional data analysis; financial time series

Special Issue Information

Dear Colleagues,

Uncertainty Quantification (UQ) is a mainstream research topic in applied mathematics and statistics. To address UQ problems, diverse modern techniques for the analysis of large and complex data have been developed in applied mathematics, computer science, and statistics.

To promote these modern data analysis methods in biology, economics, environmental studies, finance, mathematics, operational research, science, and statistics, a Special Issue of Mathematics (ISSN 2227-7390), a Science Citation Index Expanded (SCIE) journal, will be devoted to “Uncertainty Quantification Techniques in Statistics”.

The Guest Editor for this Special Issue is Prof. Dr. Jong-Min Kim.

Prof. Dr. Jong-Min Kim
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Bayesian statistics
  • Change-point detection
  • Computer model
  • Financial time series
  • Functional data analysis
  • Machine learning
  • Quality control
  • Spatial statistics

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (7 papers)


Research

16 pages, 377 KiB  
Article
Comparing Groups of Decision-Making Units in Efficiency Based on Semiparametric Regression
by Hohsuk Noh and Seong J. Yang
Mathematics 2020, 8(2), 233; https://doi.org/10.3390/math8020233 - 11 Feb 2020
Viewed by 2078
Abstract
We consider a stochastic frontier model in which the deviation of output from the production frontier consists of two components: a one-sided technical inefficiency and a two-sided random noise. In this setting, we develop a semiparametric regression-based test to compare the technical efficiencies of different decision-making unit groups, assuming that the production frontier function is the same for all groups. Our test outperforms previously proposed tests for the same purpose in numerical studies and has the theoretical advantage of working under more general assumptions. To illustrate the method, we apply the proposed test to Program for International Student Assessment (PISA) 2015 data and investigate whether an efficiency difference exists between male and female student groups at a specific age in terms of learning time and achievement in mathematics.
(This article belongs to the Special Issue Uncertainty Quantification Techniques in Statistics)
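The composed-error structure described in the abstract can be sketched in a few lines. The following is an illustrative simulation only (a hypothetical frontier function, half-normal inefficiency, and arbitrary variance settings), not the authors' test:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical production frontier with a composed error: two-sided noise v
# and one-sided technical inefficiency u >= 0 (half-normal here).
x = rng.uniform(1.0, 3.0, size=n)
frontier = 0.5 + 0.8 * np.log(x)
v = rng.normal(0.0, 0.1, size=n)          # symmetric random noise
u = np.abs(rng.normal(0.0, 0.2, size=n))  # one-sided inefficiency
y = frontier + v - u                      # observed (log-)output

# Because u only ever subtracts from output, the composed residuals are
# left-skewed; that asymmetry is what separates inefficiency from noise.
resid = y - frontier
skew = np.mean((resid - resid.mean()) ** 3) / resid.std() ** 3
```

Group comparisons of technical efficiency amount to asking whether this one-sided component differs between groups while the frontier stays common.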

14 pages, 563 KiB  
Article
Robust Linear Trend Test for Low-Coverage Next-Generation Sequence Data Controlling for Covariates
by Jung Yeon Lee, Myeong-Kyu Kim and Wonkuk Kim
Mathematics 2020, 8(2), 217; https://doi.org/10.3390/math8020217 - 8 Feb 2020
Cited by 1 | Viewed by 2892
Abstract
Low-coverage next-generation sequencing experiments assisted by statistical methods are popular in genetic association studies. Next-generation sequencing experiments produce genotype data that include allele read counts and read depths. For low sequencing depths, the genotypes tend to be highly uncertain; therefore, uncertain genotypes are usually removed or imputed before a statistical analysis is performed, which may inflate the type I error rate and reduce statistical power. In this paper, we propose a mixture-based penalized score association test that adjusts for non-genetic covariates. The proposed score test statistic is based on a sandwich variance estimator, so it is robust to misspecification of the model relating the covariates to the latent genotypes. The proposed method has the advantage of requiring neither external imputation nor elimination of uncertain genotypes. Our simulation results show that the type I error rate is well controlled and that the proposed association test has reasonable statistical power. As an illustration, we apply our statistic to pharmacogenomics data on drug responsiveness among 400 epilepsy patients.
(This article belongs to the Special Issue Uncertainty Quantification Techniques in Statistics)
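To see why low read depths leave genotypes uncertain, one can compute posterior genotype probabilities under a simple binomial read model. This is an illustrative sketch (Hardy-Weinberg prior and a fixed error rate are assumptions here), not the paper's exact mixture likelihood:

```python
import numpy as np
from math import comb

def genotype_posterior(alt_reads, depth, maf=0.3, err=0.01):
    """Posterior P(genotype | reads): genotypes carry 0/1/2 copies of the
    alternate allele, so the expected alternate-read fraction is err, 0.5,
    or 1 - err; the prior is Hardy-Weinberg with minor-allele frequency maf."""
    p_alt = np.array([err, 0.5, 1.0 - err])
    prior = np.array([(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2])
    lik = np.array([comb(depth, alt_reads) * p ** alt_reads * (1 - p) ** (depth - alt_reads)
                    for p in p_alt])
    post = prior * lik
    return post / post.sum()

# At depth 2, zero alternate reads still leave non-trivial posterior mass on
# the heterozygote; at depth 20 the same read fraction is essentially decisive.
post_low = genotype_posterior(alt_reads=0, depth=2)
post_high = genotype_posterior(alt_reads=0, depth=20)
```

Hard-calling the low-depth genotype discards exactly this residual posterior mass, which is the information loss the mixture-based test avoids.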

23 pages, 633 KiB  
Article
Combination of Ensembles of Regularized Regression Models with Resampling-Based Lasso Feature Selection in High Dimensional Data
by Abhijeet R Patil and Sangjin Kim
Mathematics 2020, 8(1), 110; https://doi.org/10.3390/math8010110 - 10 Jan 2020
Cited by 19 | Viewed by 4372
Abstract
In high-dimensional data, the performance of various classifiers depends largely on the selection of important features. Most individual classifiers paired with existing feature selection (FS) methods do not perform well on highly correlated data. Obtaining important features with an FS method and selecting the best-performing classifier is a challenging task in high-throughput data. In this article, we propose a combination of resampling-based least absolute shrinkage and selection operator (LASSO) feature selection (RLFS) and an ensemble of regularized regression models (ERRM) capable of dealing with data that have highly correlated structures. The ERRM boosts prediction accuracy using the top-ranked features obtained from the RLFS. The RLFS applies the lasso penalty under a sure independence screening (SIS) condition to select the top k ranked features. The ERRM includes five individual penalty-based classifiers: LASSO, adaptive LASSO (ALASSO), elastic net (ENET), smoothly clipped absolute deviation (SCAD), and minimax concave penalty (MCP). It is built on the ideas of bagging and rank aggregation. Through simulation studies and an application to smokers' cancer gene expression data, we demonstrate that the proposed combination of ERRM with RLFS achieves superior accuracy and geometric mean.
(This article belongs to the Special Issue Uncertainty Quantification Techniques in Statistics)
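The resampling idea behind the RLFS step can be sketched as follows. This is a minimal stand-in, not the authors' implementation: a plain ISTA lasso solver on synthetic data, with the penalty level, resample count, and data-generating model all chosen here for illustration.

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding operator for the l1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=1000):
    """Lasso for (1/2n)||y - Xb||^2 + lam*||b||_1 via iterative soft-thresholding."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n      # Lipschitz constant of the gradient
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = soft(beta - grad / L, lam / L)
    return beta

rng = np.random.default_rng(1)
n, p = 80, 30
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]           # three truly relevant features
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# Resampling-based ranking: refit the lasso on bootstrap resamples and
# rank features by how often their coefficient comes out nonzero.
B = 50
freq = np.zeros(p)
for _ in range(B):
    idx = rng.integers(0, n, size=n)
    freq += lasso_ista(X[idx], y[idx], lam=0.1) != 0
rank = np.argsort(-freq)                   # best-supported features first
```

Selection frequencies across resamples are far more stable than a single lasso fit, which is what makes the ranking usable as a screening stage before the ensemble.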

16 pages, 438 KiB  
Article
An Estimation of Sensitive Attribute Applying Geometric Distribution under Probability Proportional to Size Sampling
by Gi-Sung Lee, Ki-Hak Hong and Chang-Kyoon Son
Mathematics 2019, 7(11), 1102; https://doi.org/10.3390/math7111102 - 14 Nov 2019
Viewed by 2015
Abstract
In this paper, we extended Yennum et al.'s model, in which a geometric distribution is used as a randomization device, to a population that consists of different-sized clusters obtained by probability proportional to size (PPS) sampling. Estimators of a sensitive parameter, their variances, and their variance estimators are derived under PPS sampling and equal-probability two-stage sampling, respectively. We also applied these sampling schemes to Yennum et al.'s generalized model. Numerical studies were carried out to compare the efficiencies of the proposed sampling methods for each case of Yennum et al.'s model and its generalization.
(This article belongs to the Special Issue Uncertainty Quantification Techniques in Statistics)
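A stripped-down version of the geometric randomization device illustrates how the sensitive proportion is recovered without any respondent revealing their status. The two device probabilities and the method-of-moments estimator below are illustrative simplifications (no clustering or PPS weighting), not Yennum et al.'s exact estimator:

```python
import numpy as np

rng = np.random.default_rng(2)
pi_true = 0.25                 # unknown sensitive proportion (to be recovered)
p1, p2 = 0.3, 0.7              # hypothetical randomization-device settings
n = 20000

# Each respondent repeats a Bernoulli trial until the first success and
# reports only the trial count; the success probability depends on the
# sensitive attribute, so no individual report reveals the attribute.
has_attr = rng.random(n) < pi_true
reports = rng.geometric(np.where(has_attr, p1, p2))

# Method of moments: E[Y] = pi/p1 + (1 - pi)/p2, solved for pi.
pi_hat = (reports.mean() - 1.0 / p2) / (1.0 / p1 - 1.0 / p2)
```

Privacy comes from the fact that any report value is possible under either status; efficiency then depends on how far apart p1 and p2 are, which is where the sampling design enters.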

16 pages, 1171 KiB  
Article
Two-Stage Classification with SIS Using a New Filter Ranking Method in High Throughput Data
by Sangjin Kim and Jong-Min Kim
Mathematics 2019, 7(6), 493; https://doi.org/10.3390/math7060493 - 29 May 2019
Cited by 7 | Viewed by 2929
Abstract
Over the last decade, high-dimensional data have received much attention in bioinformatics. Such data increase the likelihood of detecting the most promising novel information, but they raise high-performance computing and overfitting issues. To overcome these issues, alternative strategies need to be explored for the detection of truly important features. A two-stage approach, consisting of filtering and variable selection steps, has been receiving attention. Filtering methods are divided into two categories: individual ranking and feature subset selection methods. Both have issues: the lack of consideration of joint correlation among features, and the computing time of an NP-hard problem. We therefore propose a new filter ranking method (PF) using the elastic net penalty with sure independence screening (SIS), based on a resampling technique, to overcome these issues. Through extensive simulation studies, we demonstrate that SIS-LASSO, SIS-MCP, and SIS-SCAD combined with the proposed filtering method achieve superior performance not only in accuracy, AUROC, and geometric mean but also in true positive detection, compared with the marginal maximum likelihood ranking method (MMLR). In addition, we apply the method to colon and lung cancer gene expression data to investigate classification performance and the power to detect true genes associated with colon and lung cancer.
(This article belongs to the Special Issue Uncertainty Quantification Techniques in Statistics)
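The SIS step mentioned in the abstract is simple to state: rank features by a marginal statistic and keep only the top d before any model-based selection runs. A minimal sketch on synthetic data (the correlation ranking and the usual d = n / log n cutoff are generic SIS ingredients, not this paper's PF ranking):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 100, 2000                 # far more features than samples
X = rng.normal(size=(n, p))
y = 1.5 * X[:, 0] - 1.2 * X[:, 1] + rng.normal(size=n)

# Marginal ranking: absolute sample correlation of each feature with y.
Xc = X - X.mean(axis=0)
yc = y - y.mean()
corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))

# Sure independence screening keeps only the top d = n / log(n) features;
# penalized selection (e.g. the elastic net) then runs on these alone.
d = int(n / np.log(n))
keep = np.argsort(-corr)[:d]
```

Shrinking 2000 candidates to about 20 before fitting is what makes the second-stage penalized models computationally and statistically tractable.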

16 pages, 356 KiB  
Article
On the Performance of Variable Selection and Classification via Rank-Based Classifier
by Md Showaib Rahman Sarker, Michael Pokojovy and Sangjin Kim
Mathematics 2019, 7(5), 457; https://doi.org/10.3390/math7050457 - 21 May 2019
Cited by 2 | Viewed by 3477
Abstract
In high-dimensional gene expression data analysis, the accuracy and reliability of cancer classification and the selection of important genes play a crucial role. Various methods have been proposed in the literature to identify these important genes and predict future outcomes (tumor vs. non-tumor), but only a few of them take into account correlation patterns and grouping effects among the genes. In this article, we propose a rank-based modification of the popular penalized logistic regression procedure based on a combination of ℓ1 and ℓ2 penalties, capable of handling possible correlation among genes in different groups. While the ℓ1 penalty maintains sparsity, the ℓ2 penalty induces smoothness based on information from the Laplacian matrix, which represents the correlation pattern among genes. We combined logistic regression with the BH-FDR (Benjamini-Hochberg false discovery rate) screening procedure and a newly developed rank-based selection method to arrive at an optimal model that retains the important genes. Through simulation studies and a real-world application to high-dimensional colon cancer gene expression data, we demonstrated that the proposed rank-based method outperforms such currently popular methods as the lasso, adaptive lasso, and elastic net in both gene selection and classification.
(This article belongs to the Special Issue Uncertainty Quantification Techniques in Statistics)
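The Laplacian-based smoothness penalty described in the abstract can be written down directly. The graph construction below (thresholded absolute correlation on hypothetical expression data) is one common choice, assumed here for illustration rather than taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(4)
n_samples, p = 200, 6

# Hypothetical expression matrix; genes 0 and 1 are made to co-express.
G = rng.normal(size=(n_samples, p))
G[:, 1] = G[:, 0] + 0.3 * rng.normal(size=n_samples)

# Encode the correlation pattern as a graph: connect genes whose absolute
# correlation exceeds a cutoff, then form the Laplacian L = D - A.
A = (np.abs(np.corrcoef(G, rowvar=False)) > 0.5).astype(float)
np.fill_diagonal(A, 0.0)
Lap = np.diag(A.sum(axis=1)) - A

# In the penalized objective, the l1 term keeps the model sparse while the
# quadratic term beta' L beta shrinks connected genes toward each other.
beta = rng.normal(size=p)
l1_term = np.abs(beta).sum()
smooth_term = beta @ Lap @ beta
```

Since beta' L beta equals the sum of (beta_i - beta_j)^2 over graph edges, the penalty is zero exactly when correlated genes share a coefficient, which is the grouping effect the abstract refers to.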

14 pages, 397 KiB  
Article
Skew-Reflected-Gompertz Information Quantifiers with Application to Sea Surface Temperature Records
by Javier E. Contreras-Reyes, Mohsen Maleki and Daniel Devia Cortés
Mathematics 2019, 7(5), 403; https://doi.org/10.3390/math7050403 - 6 May 2019
Cited by 14 | Viewed by 2665
Abstract
The Skew-Reflected-Gompertz (SRG) distribution, introduced by Hosseinzadeh et al. (J. Comput. Appl. Math. (2019) 349, 132–141), produces two-piece asymmetric behavior from the Gompertz (GZ) distribution, extending its positive support to the whole real line by means of an extra parameter. The SRG distribution also permits a better fit than its well-known classical competitors, the skew-normal and epsilon-skew-normal distributions, for data with a high degree of skewness. In this paper, we study information quantifiers such as the Shannon and Rényi entropies and the Kullback–Leibler divergence in terms of exact expressions of GZ information measures. We also provide an asymptotic test for comparing two SRG-distributed samples. Finally, as a real-world data example, we apply these results to South Pacific sea surface temperature records.
(This article belongs to the Special Issue Uncertainty Quantification Techniques in Statistics)
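The entropy and divergence quantifiers for the GZ building block can be checked numerically. This sketch uses a unit-scale Gompertz density and plain trapezoidal integration; the closed-form KL value in the comment is a check I derive from exp(x) - 1 being exponentially distributed under this parameterization, not a result quoted from the paper:

```python
import numpy as np

def gompertz_pdf(x, eta):
    """Gompertz density f(x) = eta * exp(x) * exp(-eta * (exp(x) - 1)), x >= 0."""
    return eta * np.exp(x) * np.exp(-eta * (np.exp(x) - 1.0))

def trapz(yv, xv):
    """Trapezoidal rule, kept local to avoid NumPy version differences."""
    return float(np.sum((yv[1:] + yv[:-1]) * np.diff(xv)) / 2.0)

x = np.linspace(0.0, 6.0, 100001)  # both densities stay underflow-free here
f = gompertz_pdf(x, eta=0.5)
g = gompertz_pdf(x, eta=1.0)

# Shannon entropy H(f) = -integral of f log f, and Kullback-Leibler
# divergence KL(f||g) = integral of f log(f/g), by numerical integration.
H = -trapz(f * np.log(f), x)
KL = trapz(f * np.log(f / g), x)
# Check: T = exp(X) - 1 is Exponential(eta) under f, so for these parameters
# KL(f||g) = log(0.5) + 0.5 * E_f[T] = 1 - log(2), roughly 0.307.
```

Exact expressions like those in the paper replace this numerical integration, but agreement on simple cases such as this one is a useful sanity check.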
