The tremendous development of technology also affects medical science, including imaging diagnostics. Computed tomography scanners enable the non-invasive visualization of internal human organs and tissues, without the need for surgery. This drives the research and development of new, more efficient, and more reliable diagnostic and therapeutic procedures. Medical imaging, like biomedical signal acquisition, plays an increasingly important role not only in diagnostics but also in therapy, in monitoring its effects, and in rehabilitation. At the same time, the ever-growing data sets generated by medical diagnostic devices make it difficult to explore and analyze the data manually, as doctors have done so far. Hence, more and more attention has recently been paid to the design and development of automatic data analysis systems whose applications cover various issues in healthcare. Such systems can dynamically adapt to changing conditions and thus facilitate the analysis and solution of complex problems; they often implement machine learning methods.
Machine learning (ML) is a subset of artificial intelligence (AI): algorithms are trained to find patterns and correlations in large data sets and to make decisions and predictions based on the results of this analysis. Machine learning systems become more effective over time, and the more data they have access to, the more accurate they are. Nowadays, deep learning (DL) methods are also often used in medical imaging [1]. Deep learning is a part of machine learning based on complex artificial neural networks. The learning process is called deep because the networks consist of many input, output, and hidden layers, which are often interconnected. Deep networks typically achieve much better results in the recognition, classification, and prediction of medical data than classical machine learning algorithms.
Thanks to ML technology, including DL, health care workers, including doctors, can cope with complex problems that would be difficult, time-consuming, and inefficient to solve on their own. This Special Issue includes 10 publications that discuss the use of broadly understood machine learning methods for processing and analyzing biomedical signals and images from many medical modalities. These methods allow a better understanding of how the human body functions at various levels (cellular, anatomical, and physiological) by providing additional, quantitative, and reliable information extracted from medical data.
Ihsanto [2] proposes an algorithm developed for automated electrocardiogram (ECG) classification. The ECG is a popular biosignal in heart disease diagnostics. However, it is non-stationary; thus, the implementation of classic signal analysis techniques (such as time-based analysis, feature extraction, and classification) is rather difficult. Therefore, a machine learning approach based on an ensemble of depthwise separable convolutional (DSC) neural networks was proposed for the classification of cardiac arrhythmia ECG beats. This method reduces the standard ECG analysis pipeline (QRS detection, preprocessing, feature extraction, and classification) to two steps only, i.e., QRS detection and classification. Since feature extraction is combined with classification, no ECG preprocessing is required. To reduce the computational cost while maintaining reliability, an All Convolutional Network (ACN), Batch Normalization (BN), and ensembles of convolutional neural networks were implemented. The developed ensemble of deep networks was validated on the MIT-BIH arrhythmia database. For the 16-class problem, the obtained sensitivity (Sn), specificity (Sp), positive predictivity (Pp), and accuracy (Acc) were 99.03%, 99.94%, 99.03%, and 99.88%, respectively. These classification quality measures outperform other state-of-the-art methods.
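To make the core idea concrete, the following is a minimal sketch of the depthwise separable 1-D convolution block that such networks stack; it assumes PyTorch, and the channel counts, kernel length, and beat length are illustrative rather than taken from [2].

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv1d(nn.Module):
    """Depthwise separable 1-D convolution with batch normalization.

    A depthwise convolution filters each input channel independently;
    a pointwise (1x1) convolution then mixes channels. This factorization
    uses far fewer parameters than a standard convolution of the same size.
    """

    def __init__(self, in_channels, out_channels, kernel_size=7):
        super().__init__()
        self.depthwise = nn.Conv1d(
            in_channels, in_channels, kernel_size,
            padding=kernel_size // 2, groups=in_channels, bias=False)
        self.pointwise = nn.Conv1d(in_channels, out_channels, 1, bias=False)
        self.bn = nn.BatchNorm1d(out_channels)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Example: a batch of 32 single-lead ECG beats, 128 samples each.
beats = torch.randn(32, 1, 128)
block = DepthwiseSeparableConv1d(1, 16)
features = block(beats)  # shape: (32, 16, 128)
```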
Biomedical signals are often used in the design and development of human–machine interfaces, an emerging branch of biomedical engineering. Borowska-Terka [3] proposes such a system dedicated to persons with disabilities: a hands-free, head-gesture-controlled interface. It can help, for example, paralyzed people to send messages, or the visually impaired to handle travel aids. The system contains a small stereovision rig with a built-in inertial measurement unit (IMU). To recognize head movements, two methods were considered. In the first approach, selected statistical parameters were calculated for various time-window sizes of the signals recorded from a three-axis accelerometer and a three-axis gyroscope. In the second, the signal samples recorded from the IMU were classified directly. Next, the accuracies of 16 different classifiers in distinguishing four head states (pitch, roll, yaw, and immobility) were evaluated. The highest accuracy was obtained for the direct classification of unprocessed IMU signal samples with an SVM classifier (95% correct recognitions), while the random forest classifier reached 93%. These results indicate that a person with a physical or sensory disability can efficiently communicate with other people or control applications using simple head-gesture sequences.
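As an illustration of the first, feature-based approach, the sketch below computes simple per-window statistics from a six-axis IMU stream and feeds them to an SVM; it assumes scikit-learn and NumPy, and the window size, feature set, and random stand-in data are hypothetical, not those of [3].

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def window_features(imu, win=50):
    """Split a (n_samples, 6) accelerometer+gyroscope recording into
    fixed-size windows and compute simple statistics per axis."""
    feats = []
    for start in range(0, len(imu) - win + 1, win):
        w = imu[start:start + win]
        feats.append(np.concatenate([w.mean(axis=0), w.std(axis=0),
                                     w.min(axis=0), w.max(axis=0)]))
    return np.array(feats)

# Hypothetical data: a 6-axis IMU stream and per-window gesture labels
# (0 = immobility, 1 = pitch, 2 = roll, 3 = yaw).
rng = np.random.default_rng(0)
X = window_features(rng.standard_normal((5000, 6)))
y = rng.integers(0, 4, size=len(X))

clf = SVC(kernel="rbf")
print(cross_val_score(clf, X, y, cv=5).mean())
```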
MRI is one of the most common imaging modalities used in diagnostics and treatment planning. Klepaczko [4] shows an application of dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) to visualizing and quantifying kidney perfusion, one of the most important indicators of the organ's state. In clinical practice, kidney function is assessed by measuring the glomerular filtration rate (GFR). Estimating the GFR from DCE-MRI data requires an organ-specific pharmacokinetic (PK) model, but the estimated model parameters are sensitive to the determination of the arterial input function (AIF). Thus, a multi-layer perceptron network was proposed for PK model parameter estimation, with the trust-region reflective algorithm used as a reference method. The efficiency of the proposed approach was tested on 20 data sets collected for 10 healthy volunteers, whose image-derived GFR scores were compared with ground-truth blood test values. The achieved mean difference between the image-derived and ground-truth GFR values was 2.35 mL/min/1.73 m², which is comparable to the result obtained for the reference estimation method (−5.80 mL/min/1.73 m²). It was demonstrated that the implemented neural networks ensure agreement with ground-truth measurements at a comparable level. The advantages of using a neural network are twofold. First, it can estimate a GFR value without the need to determine the AIF for each individual patient. Second, a reliable estimate can be obtained without the need to manually set initial parameter values or their constraints. After further validation and more exhaustive patient testing, the proposed approach could be implemented in clinical practice.
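The regression idea can be sketched as follows: a small MLP learns to map a sampled uptake curve directly to model parameters, with no per-patient curve fitting. The toy two-parameter exponential model, sampling grid, and noise level below are purely illustrative stand-ins for the renal PK model of [4]; the sketch assumes scikit-learn.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in for DCE-MRI tissue curves: a toy uptake model
# c(t) = A * (1 - exp(-k * t)); the network regresses (A, k)
# directly from the sampled curve.
rng = np.random.default_rng(1)
t = np.linspace(0, 5, 40)
A = rng.uniform(0.5, 2.0, 1000)
k = rng.uniform(0.2, 1.5, 1000)
curves = A[:, None] * (1 - np.exp(-k[:, None] * t))
curves += 0.01 * rng.standard_normal(curves.shape)  # acquisition noise

params = np.column_stack([A, k])
mlp = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
mlp.fit(curves[:800], params[:800])

# Mean absolute error per parameter on the held-out curves.
print(np.abs(mlp.predict(curves[800:]) - params[800:]).mean(axis=0))
```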
Another important application of ML, fundus blood vessel image segmentation, is presented in [5]. Such analysis is important in the diagnosis and treatment of several diseases, such as hypertension, coronary heart disease, and diabetes. The analysis of these images is rather complicated, and classic algorithms suffer from relatively low segmentation accuracy. Thus, an improved U-shaped neural network (MRU-NET) segmentation method for retinal vessels was proposed. First, an image enhancement algorithm improves the image contrast. Random fragmentation then divides each image into smaller blocks, which helps to reduce the complexity of the U-shaped neural network model. Next, residual learning is introduced into the encoder and decoder to improve the efficiency of feature analysis. Finally, a feature balancing module, followed by a feature fusion module, is introduced between the encoder and decoder to extract image features of different granularities. The developed segmentation technique was tested on the DRIVE and STARE datasets. The obtained accuracies (96.11% and 96.62% for DRIVE and STARE, respectively) and sensitivities (86.13% and 78.87%, respectively) outperform other state-of-the-art methods reported in the literature.
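A minimal sketch of the residual building block that such an encoder–decoder might use is given below, assuming PyTorch; the layer sizes and patch dimensions are illustrative, not the actual MRU-NET configuration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """2-D convolutional block with a residual (skip) connection, as used
    in the encoder/decoder stages of U-shaped segmentation networks."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch))
        # 1x1 projection so the skip path matches the output channels.
        self.skip = (nn.Identity() if in_ch == out_ch
                     else nn.Conv2d(in_ch, out_ch, 1, bias=False))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + self.skip(x))

# Example: a single-channel 48x48 fundus image patch.
patch = torch.randn(1, 1, 48, 48)
print(ResidualBlock(1, 32)(patch).shape)  # torch.Size([1, 32, 48, 48])
```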
A computer tool dedicated to the comprehensive analysis of lung changes in computed tomography (CT) images is described in [6]. The proposed system enables the analysis of the correlation between the radiation dose delivered during radiotherapy and the density changes in the lungs caused by fibrosis. The input data, including the patient dose, are extracted from CT images coded in DICOM format. Convolutional neural networks are used for CT processing; the selected slices are then segmented and registered by the developed algorithms. The results of the analysis are visualized graphically, enabling, for example, the presentation of dose distribution maps in the lungs. It is expected that, thanks to the developed application, it will be possible to demonstrate a statistically significant impact of low doses on lung function for a large number of patients.
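As a hint of what the DICOM extraction step involves, the sketch below reads a CT slice with pydicom and converts it to Hounsfield units, the density scale in which such changes are measured. The file path and thresholds in the usage comment are hypothetical, and the actual tool in [6] may also extract dose information from other DICOM objects.

```python
import numpy as np
import pydicom

def ct_slice_in_hu(path):
    """Read one CT slice from a DICOM file and convert the stored
    pixel values to Hounsfield units using the rescale tags."""
    ds = pydicom.dcmread(path)
    return (ds.pixel_array.astype(np.float32) * float(ds.RescaleSlope)
            + float(ds.RescaleIntercept))

# Usage on a hypothetical study (normal lung parenchyma is roughly
# -900 to -500 HU, so post-radiotherapy density changes appear as a
# shift within this range):
#   hu = ct_slice_in_hu("patient_001/slice_042.dcm")
#   lung = hu[(hu > -950) & (hu < -400)]
#   print(lung.mean())
```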
Some areas of computer-aided diagnosis benefit from worldwide competitions for research teams, mainly due to the access to extensive and well-annotated datasets. The study of Sage and Badura [7] addresses the detection of five subtypes of intracranial hemorrhage (ICH) in head CT. The authors trained and validated their tools using a public database provided by the Radiological Society of North America for the international ICH detection competition; the set used in [7] consists of over 372k images from ten thousand patients. Five separate models with a dedicated double-branch architecture were trained, one to detect each ICH subtype. Each ResNet-50-based branch analyzes two artificial RGB images: one reflecting the original CT slice in three different intensity windows, and the other combining the image with its neighbors in the series. The authors report the results produced by each separate branch, but the best results came from a combined setup: features extracted by the ResNet-50 cores are concatenated and passed to a classifier, either an SVM or a random forest. Depending on the ICH type, the F1 score ranges from 75.3% to 96.6%, surpassing other studies in the field thus far.
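The first branch input can be illustrated with a standard windowing trick: three diagnostic intensity windows of one CT slice are stacked as the channels of an artificial RGB image, so a pretrained three-channel backbone such as ResNet-50 can consume it. The window centers and widths below are common head-CT choices and only an assumption about the exact settings used in [7].

```python
import numpy as np

def window(hu, center, width):
    """Clip a Hounsfield-unit image to one intensity window and
    rescale it to [0, 1]."""
    lo, hi = center - width / 2, center + width / 2
    return (np.clip(hu, lo, hi) - lo) / (hi - lo)

def ct_to_rgb(hu):
    """Stack three diagnostic windows into RGB channels."""
    return np.stack([window(hu, 40, 80),      # brain window
                     window(hu, 80, 200),     # subdural window
                     window(hu, 600, 2800)],  # bone window
                    axis=-1)

hu = np.random.uniform(-1000, 2000, size=(512, 512))  # stand-in slice
print(ct_to_rgb(hu).shape)  # (512, 512, 3)
```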
The ResNet-50 architecture is also the core of a method proposed by Song et al. [8] for the automated detection of cephalometric landmarks in head X-ray images. Again, the study takes advantage of a public database, prepared for an international ISBI Grand Challenge in dental X-ray image analysis; this time, however, the authors supplement the investigation with their own dataset. The analysis is performed within patches extracted from the X-ray image at a registration-based coarse landmark localization stage. The ResNet-50 architecture is replicated and trained nineteen times to address each of the nineteen defined landmarks separately. The fully connected layer works in regression mode to estimate the x and y coordinates of the landmark with a two-argument mean squared error (MSE) loss function. The authors assess their methodology using the radial detection error and the successful detection rate within an assumed tolerance. Despite a relatively small training dataset, the method detects most landmarks accurately, especially in the ISBI dataset images. The comparison with state-of-the-art methods favors the current study in most evaluation metrics. The authors indicate that their approach still has room for improvement in terms of computational time, mainly in the initial automated registration.
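A minimal sketch of such a per-landmark regression head is shown below, assuming PyTorch and torchvision; the patch size, learning rate, and dummy batch are illustrative, not the paper's training configuration.

```python
import torch
import torch.nn as nn
import torchvision.models as models

# One regression model per landmark: a ResNet-50 backbone whose final
# fully connected layer outputs the (x, y) coordinates directly.
backbone = models.resnet50(weights=None)
backbone.fc = nn.Linear(backbone.fc.in_features, 2)

criterion = nn.MSELoss()  # two-argument MSE over (x, y)
optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-4)

# One illustrative training step on a dummy batch of patches.
patches = torch.randn(8, 3, 224, 224)  # patches around a landmark
targets = torch.rand(8, 2) * 224       # ground-truth (x, y) in pixels
loss = criterion(backbone(patches), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```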
Mazur-Milecka et al. [9] applied deep learning models to segment laboratory rodents in thermal images. Their study aimed at the efficient and non-intrusive monitoring of animals' activity and physiological changes, including in low-light conditions. The image data were captured at 60 frames per second by a thermal camera placed over a cage with two rats. The authors implemented and investigated two approaches: a two-stage, detect-then-segment Mask R-CNN and a single-stage TensorMask network. Various training schemes and parameter setups were used, including the transfer learning of models pre-trained on public datasets. The TensorMask model produced the most accurate results when pre-trained on visible-light images and then trained with thermal sequences (mean average precision over 90%). The authors also verified the possible benefits of inputting alternative images created by rescaling the original thermal intensities to various automatically adjusted ranges of animal temperature; however, none of these modifications improved the segmentation accuracy. In general, the single-stage methods performed better than the two-stage ones when pre-trained, whereas the opposite was true for models trained from scratch.
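The intensity-rescaling idea they tested can be sketched in a few lines: clip each raw thermal frame to an adjusted animal-temperature window and stretch it to 8-bit intensities. The temperatures and frame below are hypothetical.

```python
import numpy as np

def rescale_to_range(frame, t_low, t_high):
    """Clip a raw thermal frame (values in degrees Celsius) to an
    animal-temperature window and stretch it to 8-bit intensities,
    so the animals occupy most of the dynamic range."""
    clipped = np.clip(frame, t_low, t_high)
    return ((clipped - t_low) / (t_high - t_low) * 255).astype(np.uint8)

# Hypothetical frame: cage background around 22 C, rat body around 36.5 C.
frame = np.full((240, 320), 22.0)
frame[100:140, 120:200] = 36.5
img = rescale_to_range(frame, 30.0, 39.0)  # adjusted animal-temperature range
print(img.min(), img.max())  # background collapses to 0; animals stand out
```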
Huang et al. [10] designed a prediction system for analyzing the risk of developing amyotrophic lateral sclerosis (ALS) based on combinatorial comorbidity patterns. They employed medical history data gathered in electronic medical records (EMR) to support the identification of high-risk ALS subjects at an early stage. One of the main contributions of the study is the weighted Jaccard index (WJI), defined to balance the impact of particular comorbidities on the predicted risk. Comprehensive patient-group-wise distributions were determined over hundreds of individual- and mid-level diseases characterized in the EMR database, and these distributions yielded the WJI weights. Four machine learning models were trained and validated: logistic regression, random forest, SVM, and XGBoost. The first two were considered the most efficient based on seven canonical classification assessment measures, including an accuracy exceeding 80%. In general, the individual-level disease classification turned out to be more efficient than the mid-level analysis, and the WJI proved its advantage over the plain Jaccard index in ALS risk prediction.
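The general idea behind a weighted Jaccard index can be sketched as follows: each disease contributes its weight to the set overlap rather than counting equally. The exact definition and weights in [10] may differ from this toy example; the disease codes and weights below are hypothetical.

```python
def weighted_jaccard(patient, reference, weights):
    """Weighted Jaccard index between two comorbidity sets: the weight
    mass of the shared diseases divided by the weight mass of all
    diseases occurring in either set (1.0 = identical profiles)."""
    union = patient | reference
    inter = patient & reference
    if not union:
        return 0.0
    return (sum(weights.get(d, 0.0) for d in inter)
            / sum(weights.get(d, 0.0) for d in union))

# Hypothetical disease codes, weighted by how strongly each comorbidity
# discriminates future ALS patients from controls.
weights = {"muscle_weakness": 0.9, "dysphagia": 0.8,
           "hypertension": 0.1, "diabetes": 0.1}
als_profile = {"muscle_weakness", "dysphagia", "hypertension"}
new_patient = {"muscle_weakness", "hypertension", "diabetes"}
print(round(weighted_jaccard(new_patient, als_profile, weights), 3))
```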
Finally, Loh et al. [11] deliver a systematic review of the automated detection of sleep stages using deep learning models over the last decade. The authors summarize 36 studies from 2013 to 2020 that employ various deep models to analyze polysomnogram (PSG) recordings. After providing some medical and machine learning background and introducing the five or six sleep stages (depending on the assumed standard), they analyze sleep stage detection and classification from different points of view. First, the models and architectures are introduced (convolutional and recurrent neural networks, long short-term memory, autoencoders, and hybrid models). Second, the available databases are described. Then, deep learning approaches to sleep stage analysis are presented and compared. Finally, the authors discuss the topic in detail and draw conclusions. Even though electroencephalography (EEG) is the most widely used signal in sleep stage detection, the review states that it may not work efficiently enough on its own: automated systems should also involve other PSG recordings, e.g., electrooculography (EOG) or electromyography (EMG).
The variety of topics covered by the articles in this Special Issue is yet another proof of the enormous opportunities open to biomedical engineers and scientists familiar with artificial intelligence, machine learning, and deep learning. These techniques respond to the emerging needs of computer-aided diagnosis and therapy. Their development follows the explosion of computing power; hence, the enormity of diagnostic signal and image data has a chance to be processed, pre-analyzed, and presented to the physician within a reasonable time. From predictive models based on classification and regression to the automated multi-scale extraction of features from 2D, 3D, and even higher-dimensional data structures, contextual analysis, and segmentation, these methods are the future of the software branch of biomedical engineering. The tools themselves are becoming increasingly reliable through the development of so-called explainable artificial intelligence (XAI). All these issues guarantee that the broad biomedical application of machine learning is far from exhausted.