1. Introduction
Malaria is a transmissible and life-threatening disease, as reported by the World Health Organization (WHO), and is prevalent in many countries, especially in Asia and Africa, including Senegal [1]. The primary cause of malaria is parasites of the genus Plasmodium, which infect human red blood cells (RBCs). Malaria is transmitted via mosquitoes, inside which the parasite lives. When an infected mosquito bites a person, the parasite can be passed on and the person may become infected with malaria [2]. The WHO reported 229 million cases of malaria in 2019, and in the same year, the number of deaths reached 409,000 [3].
Notably, children under the age of five accounted for 67% (274,000) of all malaria deaths, which indicates that malaria is one of the most significant causes of death among children worldwide. In the last few years, including during the COVID-19 pandemic, the number of malaria patients rose, which is a worrying development [3,4]. Malaria therefore remains an acute health concern in large parts of the world, particularly in developing countries. As such, an automated diagnostic method that reduces the time and cost of recognizing malaria while keeping good diagnostic performance is needed.
To detect malaria, thin and thick blood smears are usually taken from possibly infected people, placed on glass slides, and examined under a light microscope. This diagnostic technique requires a high level of expertise to achieve precise results. Moreover, inadequate tools and limited proficiency in developing regions, among many other circumstances, can lead to inconsistent and delayed results and further complicate disease detection and medication [5].
Artificial intelligence, which includes machine learning and deep learning neural networks, could be used to address current healthcare problems around the world [6]. Current artificial intelligence systems in medicine already handle various tasks, from image segmentation to biometric measurement. However, constructing and training a state-of-the-art machine learning model requires an in-depth understanding of the mathematical and engineering concepts of artificial intelligence, as well as choosing suitable algorithms and tuning the hyperparameters of the models. This is usually an arduous task even for proficient engineers and computer scientists, let alone healthcare specialists with limited experience in computer science. Much work has been performed on malaria parasite detection using machine learning techniques. For example, in [7], the authors used powerful classifiers (namely ResNet and DenseNet) with transfer learning to classify cell images as either parasitized or uninfected. Another approach to detecting malaria in blood images can be found in [8], where different machine learning methods (decision tree, support vector machine, naïve Bayes, and K-nearest neighbor) were fit on features extracted by six models: VGG16, VGG19, ResNet50, ResNet101, DenseNet121, and DenseNet201. Such a manual approach to machine learning for malaria cell image classification requires machine learning experts to prepare the model. The works in [7,8], as well as the many other works on machine learning (ML) for the malaria detection problem [9], suffer from two issues. First, human effort is required to extract the most valuable features to feed the classifiers efficiently. Second, specialists in ML and deep learning (DL) are needed to build a robust model and to choose the hyperparameters that best distinguish the malaria images from the normal ones. Therefore, AutoML for vision tasks (Autokeras) could be a possible solution to mitigate these issues, as it has a highly user-friendly interface. It offers a highly automated model development framework that helps people with little coding knowledge or experience build and train their own machine learning models.
Automated machine learning methods for detecting malaria reduce the human effort needed to construct networks, offering an alternative to creating classifier networks from scratch and modifying their parameters by hand. This work was based on a novel technique that automatically discovers the best deep learning model to classify cells as parasitized or healthy. Little coding experience is required, as only a few lines of code are needed. Large technology companies have made a considerable effort to release automated machine learning software, such as Google Cloud AutoML and Azure automated machine learning. The benefit of such services is the ability to build high-performance models with as little machine learning expertise as possible. Moreover, several open-source frameworks also achieve high performance, for example Auto-WEKA [10], Auto-sklearn [11], Autokeras [12], and TPOT. Unfortunately, the above-mentioned services are restricted in their application. For instance, each is suited only to a specific problem, e.g., natural language processing (NLP), image classification, or speech recognition [13].
Figure 1 illustrates the AutoML algorithm.
In this paper, we sought to study the efficiency and performance of automated machine learning systems in the medical field. To the best of our knowledge, this is the first paper to test the power of automated machine learning (AutoML) by implementing Autokeras, an automated deep learning framework, to classify cell images as infected or not infected with the malaria parasite. Autokeras was preferred over other approaches to automated machine learning for several reasons, which are given in Section 3.1. The following points describe the main contributions of this research:
This is a pioneering work in classifying images of malaria-infected and normal blood cells with the AutoKeras software, one of the automated machine learning systems.
We propose preprocessing of the malaria image dataset before applying the Autokeras model.
We selected the best model out of the 20 trials performed by the Autokeras software, that is, the network that gave the lowest validation loss.
We demonstrated the high performance of the Autokeras software in detecting malaria-infected cells and its superiority over the traditional deep learning (DL) models, which require machine learning experts.
The results of the proposed method were applied to various medical problems, and the results were compared to other methods. The proposed method obtained better results in solving the test problems than the other methods selected from the literature.
The remainder of this paper is organized as follows. Section 2 reviews the main related works in the literature on malaria diagnosis. The adopted Autokeras model is introduced in Section 3. The evaluation metrics used are described in Section 4. Our proposed methodology and the details of its implementation are discussed in Section 5. The results are presented and evaluated in Section 6. Lastly, the work is concluded with an outlook on future work in Section 7.
2. Related Work
This section presents the related works that have used deep learning and machine learning techniques. Convolutional neural networks (CNNs) have demonstrated immense capability in image detection in various fields, and several studies have applied them to malaria parasite detection. In [14], the authors used a deep CNN to automatically detect malaria in thin blood smear images by proposing a complete computer-aided diagnosis structure. To optimize the process of feature selection, they used the transfer learning technique. By using the feature matrix in the intermediate layers, the layer embedding was removed from the intermediate convolutional layers as an additional layer of security. The proposed transfer learning technique exploited the ResNet 152 network combined with the deep greedy model for fitting.
In [15], the study proposed the data augmentation convolutional neural network (DACNN), a deep learning model that used the reinforcement learning technique. The authors compared the effectiveness of the DACNN with a CNN and a directed acyclic graph convolutional neural network (DAGCNN). Their experimental results showed that the DACNN outperformed previous works in processing and classifying the images, obtaining an accuracy of 94.79%.
In [
8], the authors proposed an approach based on using the features of ResNet50, ResNet101, VGG16, VGG19, DenseNet121, and DenseNet201. Then, they used machine learning models based on decision tree, support vector machine, naïve Bayes, and K-nearest neighbor to identify the malaria parasite in the cell images. The results showed that their proposed model could successfully detect the disease in the dedicated dataset with an accuracy of 94%.
In [
16], the PlasmodiumVF-Net framework was introduced to determine whether an image of a patient sample showed infection. If malaria infection was confirmed, another classification was performed to specify if the individual was infected with Plasmodium falciparum or Plasmodium vivax. The work was grounded on the mask regional convolutional neural network (Mask RCNN) and the ResNet50 classifier. They used a dataset containing 6000 images, which they made publicly available. Their framework achieved an accuracy of 90%.
In [17], the authors used a transfer learning approach to identify images of malaria parasite infection by integrating the Visual Geometry Group (VGG) network and a support vector machine (SVM). This hybrid approach was executed by training the topmost layers and freezing the remaining layers. First, k layers of the VGG model were retained, and the remaining (n − k) layers were replaced by an SVM. Finally, the experimental results showed that this combination of VGG19 and SVM achieved a classification accuracy of 93.1%.
The work in [18] exploited pretrained deep convolutional neural networks for the detection of malaria in images. The authors achieved detection accuracies of 93.89% and 95.20% using the GoogLeNet and ShuffleNet V2 models, respectively. Moreover, they showed that the ShuffleNet V2 model was three times faster than GoogLeNet in training.
In [19], the authors built a deep belief network (DBN) to classify 4100 blood smear images as parasite or non-parasite. The DBN was pretrained with the contrastive divergence method by stacking restricted Boltzmann machines. They found that stacking 800 layers could achieve a 96.32% accuracy. Since the dataset used in that paper was imbalanced (669 parasite and 3431 non-parasite images), the F1-score is better suited to the performance evaluation; it reached 89.66%.
Another approach using an imbalanced dataset can be found in [9], where the work was based on building a convolutional neural network. The authors aimed to predict the existence of malaria-infected cells using images obtained by microscopy of thin and thick peripheral blood smears. They also used transfer learning to compare the proposed model against pretrained models. They achieved an accuracy, precision, and sensitivity of 96.97%, 97.00%, and 97.00%, respectively.
A computer-aided diagnosis method was proposed in [20] to recognize malaria infections from blood images obtained by microscopy. Bilateral filtering was used in the suggested method to remove noise and increase the image quality. The image processing techniques of adaptive thresholding and morphological operations were then used to detect the malaria infection inside an individual cell. Their method achieved a detection accuracy greater than 91%.
In [21], the authors suggested a customized CNN. They exploited image augmentation techniques and bilateral filtering to extract the features of red blood cells before passing them to the model for training. Through data augmentation, the number of images was increased from 27,558 to 173,700. Therefore, the model was expected to generalize better and be more accurate (accuracy equal to 96.82%).
The work in [22] addressed the problem of diagnosing malaria infection from blood cell images. To tackle the problem, the authors suggested a multiheaded-attention-based Transformer model. To illustrate the efficiency of the model, they used the gradient-weighted class activation map (Grad-CAM) technique, which highlights the parts of an image that receive much more attention than the others.
In [
23], the authors’ proposed model involved three convolutional dense layers and one fully connected layer. The neural network was a sequence of multiple convolutional layers using several existing filters in the layers, resulting in a reasonable accuracy. Model training was implemented, and several blood image datasets served to assess their proposed model’s accuracy. The CNN was implemented using restricted computational resources, resulting in an accuracy of 95%.
The authors of [
24] used the EfficientNetB0 model to classify blood cell images as infected or not. To decrease the time and boost the original feature sets, their work was based on fine-tuning (transfer learning). Their proposed EfficientNetB0 model obtained an accuracy of 94.70% after training for fifty epochs. As stated in their work, they verified that the transfer learning of the EfficientNetB0 model could obtain good accuracy in recognizing malaria-infected cells in blood smear images without requiring preprocessing, data augmentation, or other techniques.
The work in [
25] was based on proposing a convolutional neural network (CNN). The construction of this CNN involved 20 layers, which was essentially required to differentiate between infected cell images and healthy ones. Their work was based on using the same dataset (LHNCBC) that we used in this study, which contains 27,558 cell images. They achieved a 95.28% overall accuracy in the experimental test result.
The work proposed in [26] relied on two phases. The first phase used a graphical user interface to detect infected and fragmented red blood cells. The second phase determined whether the thin blood smear cell images presented infection, for which a feed-forward neural network (FNN) was proposed. The dataset used in that paper consisted of 27,560 benchmark images. The proposed technique achieved a 92% accuracy in the testing phase.
However, not all current deep learning (DL) models can be readily implemented in malaria diagnosis systems, as they require extra effort and pose a technical challenge, as seen in the deep learning solutions reviewed above. Moreover, constructing a high-efficiency DL model for a particular task is complex: it is resource-expensive, time-consuming, and depends heavily on human experience through trial and error. To this end, Autokeras provides a promising alternative to manual DL solutions that requires little effort or involvement from the user, as it searches for the best-performing deep learning network and automates both the feature engineering (mining, choosing, and building) and the network construction (hyperparameter choice and fine-tuning).
4. Model Evaluation
The confusion matrix (CM) (an error matrix) can be defined as a concise table or tool that reveals how well the model can predict from a particular testing dataset. A CM consists of rows and columns giving the labels of the ground truth and the predicted class. The ground truth is the actual infected and uninfected blood cells. On the other hand, the predicted values specify the number of correct and incorrect classifications made by the model. The evaluation measures used in the confusion matrix are presented as follows:
True Positives (TPs) are the number of correctly predicted positive samples.
True Negatives (TNs) are the number of correctly predicted negative samples.
False Positives (FPs) are cases where the images were predicted as positive, but were actually negative.
False Negatives (FNs) are cases where the images were predicted as negative, but were actually positive.
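The four counts above can be tallied directly from paired labels. The following is a minimal sketch, assuming binary labels with 1 for parasitized and 0 for uninfected; the function name `confusion_counts` and the toy labels are illustrative and are not taken from the paper's implementation.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN) for binary labels (1 = parasitized, 0 = uninfected)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            tp += 1  # predicted positive, actually positive
        elif predicted == 0 and actual == 0:
            tn += 1  # predicted negative, actually negative
        elif predicted == 1 and actual == 0:
            fp += 1  # predicted positive, actually negative
        else:
            fn += 1  # predicted negative, actually positive
    return tp, tn, fp, fn


# Toy labels, made up for illustration (not from the malaria dataset).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)  # prints: 3 3 1 1
```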
Accuracy is the proportion of correct predictions among all samples and is determined by Equation (5). We calculated the model’s accuracy, precision, recall, and F1-score as the metrics. The following illustrates the concept and the equation for each metric.
Precision is the proportion of samples predicted as positive that are actually positive, determined by Equation (6).
The recall is also known as the true positive rate (TPR) or the sensitivity. It is the proportion of correctly predicted infected cells (TPs) among all actually infected cells (TPs and FNs) and is determined by Equation (7).
The F1-measure is the harmonic mean of the recall and precision. It has a maximum score of 1, which is the best case, and a lowest score of 0, and it is determined by Equation (8).
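The equations cited above are not reproduced in this section. The standard definitions of the four metrics in terms of the confusion matrix counts are given below; the numbering (5) to (8) is assumed to match the cited equations.

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \tag{5}\\[4pt]
\text{Precision} &= \frac{TP}{TP + FP} \tag{6}\\[4pt]
\text{Recall}    &= \frac{TP}{TP + FN} \tag{7}\\[4pt]
F_1 &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{8}
\end{align}
```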
6. Result and Evaluation
As mentioned earlier, the metrics used to assess the performance of the Autokeras model were the accuracy, precision, recall, and F1-measure, which are described in detail in Section 4.
Figure 6 illustrates the best network architecture, which showed high classification performance provided by the AutoKeras model from 20 trials, which is the number of networks assessed by AutoKeras (i.e., 20 trials = 20 networks). This highest-performance network was saved and used for the prediction task. Once the prediction task was accomplished, the confusion matrix metrics were used to assess the performance of the best Autokeras model.
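The search over 20 trials described above can be sketched with the public AutoKeras API. This is a minimal sketch and not the authors' code: the function name `search_best_model`, the array shapes, and the use of the default validation split are assumptions, while `max_trials=20` and 20 epochs follow the numbers stated in the text.

```python
def search_best_model(x_train, y_train, max_trials=20, epochs=20):
    """Search `max_trials` candidate networks and return the best one as a Keras model.

    x_train: image array, assumed shape (n_samples, height, width, channels).
    y_train: binary labels, assumed 1 = parasitized and 0 = uninfected.
    """
    # Imported lazily so that this sketch can be loaded without AutoKeras installed.
    import autokeras as ak

    # Each trial builds and trains one candidate network.
    clf = ak.ImageClassifier(max_trials=max_trials, overwrite=True)
    # By default, AutoKeras holds out part of the training data for validation
    # and ranks the trials by validation loss.
    clf.fit(x_train, y_train, epochs=epochs)
    # Export the best network found so it can be reused for prediction.
    return clf.export_model()


# Hypothetical usage, assuming x_train, y_train, and x_test are NumPy arrays:
# best_model = search_best_model(x_train, y_train)
# probabilities = best_model.predict(x_test)
```

Exporting the best network as a regular Keras model means the prediction step does not need to repeat the search.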
Figure 7 shows the confusion matrix. As is clear from
Figure 7, the vertical axis indicates the target class (actual label), while the system’s predicted class (output label) is presented along the horizontal axis.
These four metrics, which can be derived from the confusion matrix, give a fuller description of the adopted model’s evaluation results. A complete review of the performance evaluation metrics is offered in Table 2. The Autokeras model showed a promising ability to distinguish the images of infected thin blood smear cells from the uninfected ones, with an accuracy of 95.6%, precision of 95.5%, recall of 95.7%, and F1-score of 95.6%. The accuracy and loss curves over the 20 epochs are illustrated in Figure 8, which shows that a 95.6% accuracy was obtained after ten iterations. Furthermore, we show the evaluation metrics’ results for the three best networks obtained after 20 AutoKeras trials in Figure 9. As is clear from the figure, we adopted the first network for our prediction results over the malaria testing dataset.
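As a quick consistency check on the reported figures, the F1-score can be recomputed from the reported precision and recall, since it is the harmonic mean of the two. The sketch below uses only the values stated in the text; the function name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the text (95.5% and 95.7%).
reported_precision = 0.955
reported_recall = 0.957

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints: F1 = 0.956
```

The recomputed value agrees with the reported F1-score of 95.6%.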
Despite the large size of the malaria dataset of the images of the thin blood smear cells, Autokeras’ proposed model proved its robustness in classifying the cell images into infected or uninfected with malaria.
The Grad-CAM algorithm was applied to visualize the location of the prominent features appearing on the parasitized blood cell. This approach helped add transparency to our outcomes. Grad-CAM is a generic method that uses the gradients flowing into the final convolutional layer of a CNN to weight that layer’s high-level feature maps. As such, a heat map of the regions most relevant to the prediction is produced. In Figure 10, the parasite detection using the Grad-CAM algorithm is presented for the proposed model. As is clear from the figure, the proposed model kept its attention on the critical areas. Moreover, it produced a compelling interpretation and a precise localization of the parasites, as the hot region closely followed the location of the infection.
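The core weighting step of Grad-CAM can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the implementation used in this work: it assumes the feature maps of the last convolutional layer and the gradients of the class score with respect to them have already been extracted from the network, and the function name `grad_cam_heatmap` and the random toy inputs are hypothetical.

```python
import numpy as np


def grad_cam_heatmap(feature_maps: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Combine feature maps and gradients, both of shape (H, W, K), into an (H, W) heat map in [0, 1]."""
    if feature_maps.shape != gradients.shape:
        raise ValueError("feature_maps and gradients must have the same shape")
    # Channel weights: average the gradients over the two spatial axes.
    weights = gradients.mean(axis=(0, 1))  # shape (K,)
    # Weighted sum of the feature maps; ReLU keeps only positive evidence for the class.
    cam = np.maximum(np.tensordot(feature_maps, weights, axes=([2], [0])), 0.0)
    # Normalize to [0, 1] for display, guarding against an all-zero map.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy inputs standing in for real activations and gradients (illustration only).
rng = np.random.default_rng(0)
feature_maps = rng.random((4, 4, 8))
gradients = rng.standard_normal((4, 4, 8))
heatmap = grad_cam_heatmap(feature_maps, gradients)
print(heatmap.shape)  # prints: (4, 4)
```

In practice, the heat map is resized to the input image size and overlaid on the cell image.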
Table 3 presents a complete comparison between the results of the Autokeras model and the results of the other related works. For each work, the table lists five attributes: the publication title, publication year, number of images in the dataset used, the approaches used, and the overall accuracy. The accuracy, which was beyond 90% for all the models, indicates each model’s efficiency in the malaria classification task.
As Table 3 shows, the Autokeras model outperformed most of the previous works in recognizing malaria from images of blood smear cells. In particular, it proved its advantage over works such as those in [8,15], which were performed on the same dataset and achieved accuracies of 94% and 94.79%, respectively. It is important to note that some works, such as [19], reported a higher accuracy (96.32%). However, they used an imbalanced dataset, for which accuracy is not the best metric and the F1-score is more suitable. In [19], the model produced an F1-score equal to 89.66%.
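To see why accuracy can mislead on an imbalanced dataset, consider the class counts reported for the dataset in [19] (669 parasite and 3431 non-parasite images). The sketch below evaluates a hypothetical baseline that labels every image as non-parasite; the baseline is an illustration and does not appear in any of the cited works.

```python
# Class counts reported for the imbalanced dataset in [19].
parasite, non_parasite = 669, 3431
total = parasite + non_parasite

# Hypothetical baseline: every image is predicted as non-parasite,
# so there are no positive predictions at all.
tp, fp = 0, 0
fn, tn = parasite, non_parasite

accuracy = (tp + tn) / total
recall = tp / (tp + fn)
f1 = 2 * tp / (2 * tp + fp + fn)  # equivalent form of the F1-score

print(f"accuracy = {accuracy:.3f}, recall = {recall:.3f}, F1 = {f1:.3f}")
# prints: accuracy = 0.837, recall = 0.000, F1 = 0.000
```

A classifier that never detects a parasite still reaches about 84% accuracy on this class distribution, while its F1-score is zero.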
Moreover, the works in [9,21] demonstrated better performance than ours, which can be attributed to their augmentation techniques: the size of the dataset increased, which may help the model generalize better (more data result in a more accurate model [32]). In our work, we did not augment the dataset, which mimics the real-world scenario. Even so, the proposed work produced a robust model whose accuracy is only approximately 1 percentage point lower than theirs.
In conclusion, this paper applied automated machine learning, a novel optimization technique that tunes the hyperparameters of machine learning models. The proposed method proved able to obtain parameter values that yielded better results on the evaluation measures used than several other methods from the literature.
7. Conclusions
This work adopted the open-source AutoKeras automated deep learning framework to identify malaria parasites in blood cell images. AutoKeras focuses on techniques for automatically determining the highest-performing model for a given dataset. This topic has received attention recently and has been developing quickly ever since. The primary motivation for automating hyperparameter tuning in any model is to find an automatic ML solution that achieves high accuracy and leads to a new level of automation in the future using artificial intelligence.
The proposed Autokeras malaria classification approach achieved a best overall accuracy of 95.6%. The experimental tests showed that Autokeras outperformed several previous works. According to the comparisons with state-of-the-art methods, this suggests that the proposed method can solve other similar problems efficiently.
Other machine learning methods combined with advanced optimization algorithms can be used to solve the hyperparameter problems in any model and find a robust automatic ML method. Moreover, other applications can be addressed using the proposed method, such as brain medical images, patient risk identification, pancreatic cancer, sarcoma, pattern imaging analytics, stomach cancer, clinical trial research, predicting epidemics, kidney cancer, maintaining healthcare records, and others.
The main limitation of automated machine learning is that training takes a long time on a limited platform. We therefore recommend using faster hardware in such cases.