1. Introduction
Sickle cell disease (SCD) is an inherited hemoglobin disorder most commonly found in populations of sub-Saharan Africa, India, the Mediterranean and the Middle East [
1]. Characteristics of the disease include chronic hemolytic anemia, acute pain, organ damage and significantly shorter lifespans [
2]. The pathophysiology of SCD mainly involves the rapid polymerization of hemoglobin (HbS) after deoxygenation, causing deformation of red blood cell (RBC) morphology and disrupting blood circulation [
3]. Consequently, vaso-occlusion occurs due to the inflexibility and highly adhesive nature of sickled RBCs [
3]. Organ failures are common complications of SCD from repeated vaso-occlusive processes and often lead to increased mortality [
4,
5].
Mortality is particularly high in the under-five age group, reaching 50–90% among children born with the β-globin S gene mutation in sub-Saharan Africa [
6]. In North America, mortality rates for SCD-related complications have nearly been eliminated due to the availability of comprehensive blood tests and newborn screening protocols [
6,
7]. However, access to specialized SCD diagnosis laboratory equipment such as high-performance liquid chromatography or electrophoresis is extremely limited in low-resource countries, and thus, a sickle solubility test is the most affordable and available testing method [
8]. The test looks for the presence of deoxygenated HbS precipitate in phosphate buffer, formed from reactions induced by sodium bisulfite. While a ‘positive’ result confirms the presence of HbS, the test cannot provide detailed information on the percentage of sickling cells in a heterogeneous blood sample and is not effective for HbS levels lower than 15–20% [
9]. More accurate and accessible diagnostic tools are needed for SCD management.
Healthy and sickle cell trait (SCT) individuals generally have normal hemoglobin levels, normal RBC shape, and have no significant clinical or hematologic manifestations [
10,
11]. The defining feature of SCD at the microscopic level is the gradual morphological deformation of select sickle erythrocytes due to the progressively increasing polymerization of HbS under low oxygen tension [
12]. Morphological and biomechanical profiling of single sickle cells has been accomplished using quantitative phase microscopy (QPM), a powerful label-free imaging modality [
13,
14]. Yet, the application of QPM to cytological diagnosis has been limited by low imaging cell counts. Recent advances in QPM have raised imaging cell counts in an effort to provide a more complete view of overall cell population morphology and, thus, more diagnostic value. For example, high-throughput QPM has been achieved via the fast-scanning time-stretch approach, producing a state-of-the-art throughput of 77,000 cells/second [
15]. However, these results are achieved at the expense of instrument complexity, large amounts of computing power and lengthy image reconstruction times. As an alternative, we recently introduced holographic cytometry (HC) as a simple, low-cost, high-throughput QPM system, based on an off-axis Mach–Zehnder configuration that requires phase reconstruction times only on the order of milliseconds [
16,
17].
In this study, we seek to evaluate HC’s clinical applicability to SCD screening and monitoring. Millions of single-cell images are acquired with HC from just a few drops of blood in a few minutes and passed through a machine learning algorithm for detailed diagnosis. Here, we advance the algorithmic approach to improve identification of cells from SCD patients. While cells from a given SCD patient sample can exhibit a wide range in degree of cellular deformation [
18,
19] due to the heterogeneity in SCD cellular morphology, here, we define a set of selection criteria to refine a training dataset so that it contains only extremely sickled cells. Upon using this filtered dataset for training, we developed deep learning algorithms that can accurately predict the percentage of severely sickled cells in unknown patient samples.
3. Materials and Methods
3.1. System
The experimental setup consists of the HC imaging system [
20] accompanied by artificial intelligence algorithms to realize SCD diagnosis. As shown in
Figure 4, the imaging system is a Mach–Zehnder interferometer that consists of pathlength-matched sample and reference arms, where images of RBCs flowing within microfluidic channels are captured in the form of single-cell holograms. The overall magnification of the system is 33×, and the field of view covers 16 channels at once. The camera (Dalsa HS-40-04K40-00-R) acquires 300 frames per second, synchronized to an acousto-optic modulator that pulses a 640 nm laser (PicoQuant Fast Switched Diode Laser FSL 500). The 350 μs pulses prevent streaking by minimizing motion blur from the flowing cells.
3.2. Image Reconstruction
Single-cell phase images are reconstructed from interferograms using Fourier transform, phase unwrapping, digital refocusing and segmentation algorithms, at rates up to 150 ms/cell [
20]. Twenty-five morphological parameters are extracted from each single-cell image. Customized exclusion parameters are used to eliminate images of cell clumps and debris from the dataset. Normal cell images with mean phase values below 0.4 rad and SCD cell images with mean phase values below 0.3 rad are excluded from the final dataset. In total, 1981 SCD images and 5552 healthy images were excluded from our analysis.
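The mean-phase exclusion step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the class labels `"normal"` and `"scd"`, and the representation of a cell as a flat list of phase values over its segmented pixels are all assumptions made here; only the 0.4 rad and 0.3 rad cutoffs come from the text.

```python
from statistics import fmean

# Cutoffs from the text (radians); the dictionary keys are illustrative labels.
MIN_MEAN_PHASE_RAD = {"normal": 0.4, "scd": 0.3}


def keep_cell(phase_pixels, label):
    """Return True if the cell's mean phase is at or above its class cutoff."""
    return fmean(phase_pixels) >= MIN_MEAN_PHASE_RAD[label]


def filter_cells(cells):
    """Split (phase_pixels, label) pairs into kept and excluded lists."""
    kept, excluded = [], []
    for pixels, label in cells:
        (kept if keep_cell(pixels, label) else excluded).append((pixels, label))
    return kept, excluded
```

Note that the same mean phase can be kept for one class and excluded for the other, because the two classes use different cutoffs.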
3.3. Microfluidics
Customized lithographic patterns are fabricated onto blank Si wafer disks through an SU-8 etching process. The finished Si mold is used to form microfluidic channels (
Figure 4) using polydimethylsiloxane (PDMS), mixed at a 10:1 ratio with PDMS curing agent. The mixture is baked in an oven at 85 °C for two hours, allowing the channels to cure and solidify. The cured PDMS slabs are then plasma bonded to glass coverslips in a reactive ion etcher chamber. Entry and exit ports to the channels are punched into the PDMS slab prior to bonding. During each data acquisition session, the flow rate is set to 10 μL/min.
3.4. Blood Samples
Fresh packed RBCs (pRBCs) from 2 healthy donors and 3 SCD donors (
Table 6) were purchased from an external vendor (BioIVT) for the purpose of this study. An amount of 50 μL of pRBCs was suspended in 5 mL of 20% bovine serum albumin (BSA) solution and pumped into the microfluidic channel using a syringe pump. Archived RBC HC imaging data (
Table 7) from our 2021 study, which were analyzed here, were also processed under the same protocol [
17].
3.5. Selective Search and Training Set Refinement
To construct an automatically labelled ground truth training set with minimal errors, we established a search criterion: for a given morphological parameter, a threshold defines a histogram tail in which the percentage of the SCD population falling beyond the threshold (above or below, depending on the parameter) is 21 times the percentage of the normal population falling beyond it. Previous studies have shown that logistic regression achieves the highest training accuracy when trained with datasets of at least 5000 images [
17]. The 21:1 cutoff ratio was chosen to balance high selectivity against retaining a reasonable diseased cell count (at least 5000 cells) in the final combined dataset. We term the cells in these tails critically sickled cells. The final training set consists of sample 1 and sample 2 data filtered using the search criterion.
For each morphological parameter, a population-wide search is conducted to find the histogram tail threshold where the search criteria can be satisfied (
Figure 5). As an example, 0.8118 radians is the threshold calculated for the maximum phase histograms. For every 21 sickle cells that are below this maximum phase threshold, only 1 healthy cell will exist in the same regime. Several of the morphological parameters, including max phase, standard deviation of phase, top 25% optical path length, max phase gradient, eccentricity and elongation ratio all had tail distributions that satisfied the search criterion. These tail subsets are grouped together, and the union of the sets yielded 18,925 SCD cell images for training. Another 18,925 normal cell images were randomly selected from Sample 4 and Sample 5 to add to the training set. Test datasets consist of all archived data shown in
Table 7 and Sample 3 SCD data.
Figure 5 presents a graphical overview of the tail selection process.
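One way to picture the tail search for a single morphological parameter is the sketch below. It is an illustration under stated assumptions rather than the procedure actually used: the function name is hypothetical, candidate thresholds are restricted to the observed SCD values, and choosing the largest threshold that still meets the ratio is an assumption made here to keep as many SCD cells as possible. Only the 21:1 ratio comes from the text.

```python
from bisect import bisect_right


def lower_tail_threshold(scd_values, normal_values, ratio=21.0):
    """Largest threshold t whose lower tail (value <= t) holds at least
    `ratio` times the fraction of SCD cells as of normal cells.

    Returns None when no candidate threshold satisfies the criterion.
    """
    scd = sorted(scd_values)
    normal = sorted(normal_values)
    best = None
    for t in scd:  # candidates are the observed SCD values, in ascending order
        scd_frac = bisect_right(scd, t) / len(scd)
        normal_frac = bisect_right(normal, t) / len(normal)
        # Cross-multiplied form avoids dividing by zero when the tail
        # contains no normal cells at all.
        if scd_frac >= ratio * normal_frac:
            best = t
    return best
```

For parameters where the informative tail lies at high values, the same function can be applied to the negated values and the returned threshold negated back.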
3.6. Logistic Regression and Deep Learning
Two different training sets were used to construct two variants each of the logistic regression (LR) and convolutional neural network (CNN) models. LR-ALL and CNN-ALL were trained using unrefined data from SCD and normal samples, whereas LR-SEL and CNN-SEL were trained using refined data from the selective search process described above.
Morphological parameters extracted from samples 1, 2, 4 and 5 were used for training the LR algorithms to distinguish between SCD and normal. The testing dataset consisted of morphological parameters extracted from samples A–E and sample 3. In total, the LR-ALL model was trained on an unrefined dataset of 2 × 100,000 × 25 parameters (#classes × #cells × #parameters), and the LR-SEL model was trained on a refined dataset of 2 × 18,925 × 25 parameters.
Rather than first extracting morphological parameters, single-cell images from samples 1, 2, 4 and 5 were used for training the deep learning neural networks. Single-cell images from samples A–E and sample 3 were used for testing the deep learning neural network’s performance. Overall, CNN-ALL was trained on an unrefined dataset of 2 × 100,000 (#classes × #cells) cell images, and CNN-SEL was trained on a refined dataset of 2 × 18,925 cell images to make inferences on 6 unknown patient samples.
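The class-balanced structure of the refined training sets (the union of the selected tails for the SCD class, plus an equal number of randomly drawn normal cells) can be sketched as follows. The function name, the dictionary-per-cell representation, the parameter names and the fixed random seed are assumptions made for illustration; they are not taken from the original implementation.

```python
import random


def build_refined_set(scd_cells, normal_cells, tail_tests, seed=0):
    """Return (features, labels) for a class-balanced refined training set.

    scd_cells / normal_cells: lists of dicts mapping parameter name -> value.
    tail_tests: dict mapping parameter name -> predicate that is True when a
    value falls inside that parameter's selected tail.
    """
    # Union of tails: keep an SCD cell if ANY parameter places it in a tail.
    selected = [
        cell for cell in scd_cells
        if any(test(cell[name]) for name, test in tail_tests.items())
    ]
    # Draw the same number of normal cells at random, without replacement.
    # random.sample raises ValueError if there are too few normal cells.
    rng = random.Random(seed)
    matched = rng.sample(normal_cells, len(selected))
    features = selected + matched
    labels = [1] * len(selected) + [0] * len(matched)  # 1 = SCD, 0 = normal
    return features, labels
```

With 25 parameters per cell and 18,925 selected SCD cells, the output has the 2 × 18,925 × 25 shape quoted for LR-SEL.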
4. Discussion
We observe that the morphological parameter histograms of the sickle and normal samples largely overlap; however, a nontrivial number of sickle cells in the distribution exhibit almost no phenotypic similarity to healthy cells. This unique subset of sickle cells can be extracted from the distribution by applying a 21:1 SCD-to-normal ratio thresholding criterion. As described above, this means that each criterion is defined by the region of the histogram where the SCD cell population has a 21× greater incidence of the chosen parameter than the normal cell population. The extraction of morphological parameters that uniquely identify sickling cells is a necessary step in constructing a meaningful, pure SCD ground truth set. Without thresholding, the large overlap between the morphological parameters of sickle and normal cell samples would confound the discrimination capacity of a classification algorithm. Furthermore, by refining the training set, we have narrowed the focus of the classification algorithm to specifically differentiating SCD cells from healthy cells, greatly increasing accuracy. The near ideal ROC curves (
Figure 2B and
Figure 3B) for the LR and CNN algorithms trained on the refined data indicate that the algorithms predicted practically no false positives and would excel at identifying critically sickled cells. Overall, we have developed a group of metrics that can delineate critically sickling cells in a heterogeneous SCD sample.
When training is switched from the unrefined to refined dataset, significant improvements in sensitivity, specificity and accuracy are observed in the machine learning models’ performance. The ROC graph shown for refined dataset (
Figure 2B and
Figure 3B) has a greater AUC than the unrefined dataset (
Figure 2A and
Figure 3A), indicating that enforcement of the selection criterion greatly improved the models’ class separation capacity between positive and negative class points. Between LR and CNN, CNN-SEL’s AUC (0.9996) is greater than LR-SEL’s AUC (0.99897), and CNN-SEL has an overall higher average normal sample accuracy level than LR-SEL (96.71% > 93.17%); thus, it is evident that CNN performed better at classifying the SCD class. Implementation of the search criterion is critical for the realization of an accurate depiction of a heterogeneous cytological blood sample for clinical settings.
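For readers less familiar with the AUC as a measure of class separation, the sketch below computes it from classifier scores using its rank interpretation: the probability that a randomly chosen positive cell receives a higher score than a randomly chosen negative cell. This is a generic illustration, not the code used to produce the reported values, and the function name is hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """Probability that a random positive scores higher than a random negative
    (ties count one half); this equals the area under the ROC curve."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

An AUC of 1.0 means every positive cell outscores every negative cell, and 0.5 corresponds to chance. The pairwise loop is quadratic in the number of cells, so a sort-based implementation would be preferable for datasets of the size discussed here.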
While it is possible to produce classification decisions by simply evaluating the absolute number of critically deformed cells in any unknown patient sample, it would be difficult to analyze a sample with a large number of cells with this approach in real-time clinical settings. Out of the 471,792 sickle cells analyzed, only 18,925 cells matched our critically sickled cell criterion, equating to less than 5% of the overall population. In clinical care settings, where a priority is placed on receiving quick, actionable results, it is likely that smaller sample sizes such as ~5000 cells would be used for analysis. Under such circumstances, deep learning models are superior to simple thresholding of a large population dataset because they can provide rapid decisions from much smaller sample sizes. Although thresholding deemed less than 5% of the population critically sickled, deep learning found that more than 25% of the cell population differed sufficiently to be identified as diseased. Deep learning thus potentially has the capacity to make a more nuanced decision than thresholding: cells exhibiting slightly less severe yet still sickled deformation can be captured by the deep learning model, whereas traditional thresholding would miss them.
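As a quick check of the proportion quoted above, and of what it would imply for a smaller clinical sample, the arithmetic is shown below. The 5000-cell figure is the illustrative sample size mentioned in the text; applying the same fraction to it is a simplifying assumption made here for illustration only.

```latex
\[
\frac{18{,}925}{471{,}792} \approx 0.0401 = 4.01\%,
\qquad
5000 \times 0.0401 \approx 200 \text{ cells}
\]
```

Under that assumption, a 5000-cell sample would contain only about 200 cells meeting the threshold criterion, which illustrates why a decision based on thresholding alone rests on a small fraction of the sample.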
Both LR and CNN produced extremely high accuracy rates in identifying healthy subject samples, yet lower counts of critically sickled cells in an SCD patient sample. Although only 25–29% of SCD patient cells were identified as SCD, this is consistent with the expectation that not all RBCs in a given sample are critically sickled and that only a portion of the RBCs are sufficiently altered to appear distinct from healthy RBCs. Since the deep learning network showed its worst performance on Sample D, with 92.62% accuracy, we suggest a predicted diseased cell count of 8% as the lower bound and 25% as the upper bound for a decision model. For patient samples in which the deep learning model inferred more than 8% critically sickled cells, the decision model would recognize the entire sample as SCD. Potentially, the sample-to-sample variation in percentages (8–25%) could be used as a metric to evaluate the overall disease severity of a given patient and possibly provide information that could be used to predict sickling crisis and response to therapeutic treatments.
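The sample-level decision rule suggested above can be written as a short sketch. It assumes per-cell predictions are already available as 0 (normal) or 1 (sickled); the function and constant names are hypothetical, and only the 8% cutoff is taken from the text.

```python
SCD_FRACTION_CUTOFF = 0.08  # lower bound suggested in the text (about 8%)


def diseased_fraction(cell_predictions):
    """Fraction of cells the classifier labelled as sickled (1) rather than normal (0)."""
    return sum(cell_predictions) / len(cell_predictions)


def classify_sample(cell_predictions, cutoff=SCD_FRACTION_CUTOFF):
    """Call the whole sample 'SCD' when the diseased fraction exceeds the cutoff."""
    return "SCD" if diseased_fraction(cell_predictions) > cutoff else "normal"
```

For example, a sample in which 27 of 100 cells are predicted as sickled would be called SCD, whereas one with 7 of 100 would not. The cutoff is kept as a parameter because the text anticipates that additional SCT and SCD data may shift these bounds.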
While SCD individuals are homozygous for HbS, SCT has a heterozygous genotype and is often considered a benign condition, with morbidity and mortality rates similar to the general population [
21]. We predict that the diseased cell count for SCT individuals would present at less than 8%. Future work should be focused on incorporating SCT data into the selective training algorithm. A side-by-side comparison of morphological parameter histograms may aid in the development of a set of additional selection criteria for differentiating SCT individuals from healthy subjects and SCD patients. At the other end of the spectrum, we would predict that some SCD individuals will have diseased cell counts above 25%. Patients with different SCD severity may exhibit different levels of diseased cell count. Additional SCD patient data may help fill the upper gaps in the current decision model.
One limitation of our approach is that other types of morphology-altering blood diseases have not been considered in this study. Previous studies have demonstrated QPM as a useful tool for detecting RBC anomalies such as changes in sphericity due to storage and water content changes due to mechanical compression [
22,
23]. The selective training methods presented in this paper can further enhance QPM’s sensitivity to RBC morphology changes. With further model development, the HC modality may potentially be used to detect blood disorders that cause RBC deformation, such as hereditary spherocytosis and beta thalassemia.