1. Introduction
Chest X-ray (CXR) imaging has been used for diagnostic imaging since the early days of radiology, and despite the development of modalities such as computed tomography (CT) and magnetic resonance imaging (MRI), it is still widely used to screen for cardiovascular and pulmonary diseases. The cardiothoracic ratio (CTR) measured on CXR images is an important index for evaluating heart disease [1,2,3]. CTR is usually calculated as the ratio of the largest transverse cardiac dimension to the largest transverse thoracic dimension on a CXR obtained with posterior irradiation. Posterior irradiation avoids enlargement of the cardiac shadow because, given the geometry of the X-ray system, the heart lies close to the detector. Lung water is also an important index for assessing heart and lung diseases; it can increase secondary to elevated intravascular hydrostatic pressure in cardiogenic edema resulting from heart failure, to increased permeability of the pulmonary microvessels in acute lung injury or acute respiratory distress syndrome, and to other causes [4,5]. Since the beginning of the coronavirus disease 2019 (COVID-19) pandemic, CXR imaging has played a crucial role in evaluating lung conditions: it is easily accessible, involves low radiation exposure as a primary screening tool, and assesses pneumonia, one of the typical symptoms of COVID-19, at low cost [6,7,8]. The importance of the CXR examination in daily medicine is therefore beyond doubt. However, both the CTR and the distribution of lung water change with the patient's position during CXR imaging. For example, if the patient can walk independently, CXR imaging is performed in the standing position; any lung water detected is therefore concentrated in the lower lung field owing to gravity. In contrast, in a patient who has difficulty standing, such as a critically ill patient, CXR imaging is performed in the supine position, and any lung water present is distributed over the entire dorsal side of the lungs. In other words, the patient's position is an essential factor for the accurate interpretation of CXR images in clinical practice. Therefore, the patient's position should be reported to readers immediately after CXR imaging.
However, this reporting mostly depends on manual operations by radiological technologists. For example, immediately after CXR imaging, technologists add a marker showing the patient's position, such as "standing", "sitting", or "lying", to the free space in the CXR image. Although these markers allow image readers, radiologists, and referring physicians to identify the imaging conditions at a glance across a large number of clinical images, human error can occur at some point in this process, and such errors can cause serious incidents when the patient's position is misunderstood in clinical practice [9]. For instance, imaging findings may be misinterpreted if the patient's position is mistaken. Moreover, a CXR may be output with left-right reversal when the irradiation direction differs from that assumed by the X-ray console, for example when an image acquired in a non-standing position is processed with the console parameters for the standing position. Despite the considerable efforts of radiological technologists, such errors are encountered quite often in clinical practice. Therefore, an automatic, highly accurate system that requires no manual operation is desirable.
Artificial intelligence (AI) has been applied to the automatic assessment of medical images in recent years [10,11]. A large amount of image data is needed to construct a high-accuracy model with machine learning (ML) and deep learning (DL). From this viewpoint, CXR, a traditional and common examination, is well suited to attempts to apply ML and DL methods [12]. Meanwhile, Marcus et al. argue that it is difficult to replace human workers with AI because AI lacks the core frameworks of human knowledge: time, space, causality, and basic knowledge of physical objects, humans, and their interactions [13]. Hence, the medical field has been trying to implement AI specialized for specific tasks. We therefore planned to construct DL models focused on the quality assurance (QA) of CXR.
Previous studies have reported the automatic correction of radiographs using AI [14,15]. We hypothesized that a QA system for CXR could be developed using DL models. A classification method can estimate the orientation of the CXR and the patient's position, and a regression method can assess the angle of the CXR. We therefore attempted to construct an automatic QA system that combines classification and regression methods. We expected the system to correct the orientation, angle, and left-right reversal of the CXR and to estimate the patient's position. Finally, the QA system adds a marker indicating the patient's position to the free space of the CXR. In other words, the manual work of technologists is completely replaced by automation in the QA system. If the QA system manages CXR with high accuracy, it can play an important role in preventing serious medical accidents and can be one of the most effective techniques for improving the overall quality of medical procedures and contributing to patient safety. This study aimed to construct a DL-based QA system for CXR and to assess its usefulness.
2. Materials and Methods
This study was conducted in accordance with the principles of the Declaration of Helsinki and was approved by the local Institutional Review Board of the Otaru General Hospital (04–022). All the patients or their families provided oral or written informed consent to participate in this study.
2.1. Concept of a QA System for CXR
The workflow of the QA system is outlined in Figure 1, and the datasets are summarized in Table 1. The QA system consisted of four models: (1) correction of orientation, (2) correction of angle, (3) correction of left-right reversal, and (4) judgment of the patient's position. The regression method was used for the angle correction, and the other corrections were performed using classification methods.
In this QA system, each image is programmed to undergo correction at each step according to the judgment of the corresponding DL model. Finally, the QA system adds a marker to the free space of the CXR to inform readers of the patient's position according to the judgment of the DL model. Markers were added at the upper-right corner of the images.
The QA system consisted of four steps: images were corrected for orientation, angle, and left-right reversal, and the patient's position was then estimated. The system was constructed by combining the classification method and the regression method. QA, quality assurance.
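As a rough illustration of this four-step flow, the following MATLAB sketch applies the four trained models sequentially to a single image; the model variables (netOrientation, netAngle, netFlip, netPosition) and the class label names are hypothetical placeholders, not the actual implementation.

```matlab
% Minimal sketch of the four-step QA flow (model variables and label names are assumed).
img = imread('cxr_224.jpg');                        % pre-processed 224 x 224 CXR

% (1) Orientation: classify the rotation (labels assumed to be '0','90','180','270') and undo it
oriLabel = classify(netOrientation, img);
img = imrotate(img, -double(string(oriLabel)));

% (2) Angle: regress the residual tilt in degrees and rotate it back
tilt = predict(netAngle, img);
img = imrotate(img, -tilt, 'bicubic', 'crop');

% (3) Left-right reversal: flip back if the classifier judges the image mirrored
if classify(netFlip, img) == "flipped"
    img = flip(img, 2);
end

% (4) Patient position: standing / sitting / lying, used to choose the marker
posLabel = classify(netPosition, img);
```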
2.2. Datasets and Preprocessing of Images
Chest-14 data [16,17,18] and clinical images were used for this study. Only the patient's position was evaluated using the clinical data of Otaru General Hospital; the other datasets were constructed from Chest-14. To focus on adult patients, data from children aged <15 years were carefully excluded from Chest-14. Inappropriate images were also excluded from Chest-14 based on a visual review by four radiological technologists. Two radiological technologists confirmed that no subject had left-right organ reversal. All images were converted into a 224 × 224 matrix in Joint Photographic Experts Group (JPEG) format as input for the convolutional neural networks (CNNs).
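A minimal sketch of this pre-processing step is shown below, assuming hypothetical source and output folder names; it simply resizes each image to a 224 × 224 matrix and saves it in JPEG format.

```matlab
% Sketch of the pre-processing: resize to 224 x 224 and save as JPEG (paths are assumptions).
srcFiles = dir(fullfile('chest14_src', '*.png'));
mkdir('preprocessed');
for k = 1:numel(srcFiles)
    img = imread(fullfile(srcFiles(k).folder, srcFiles(k).name));
    if size(img, 3) == 3
        img = rgb2gray(img);                         % keep a single-channel matrix
    end
    img = imresize(img, [224 224]);
    [~, name] = fileparts(srcFiles(k).name);
    imwrite(img, fullfile('preprocessed', [name '.jpg']));
end
```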
2.3. Dataset for the Orientation Correction
Through the above processing, a total of 20,000 images were prepared for the orientation correction model. To avoid training on similar images, all images were selected from the first examination of each patient in Chest-14. The images were divided into 16,000 training images and 4000 test images. The dataset contained equal quarters of 0°, 90°, 180°, and 270° rotations, and half of the images were flipped horizontally.
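The rotation and flipping described above could be generated as in the following sketch, where folder names are assumptions and the class labels are taken from the applied rotation angle.

```matlab
% Sketch of the orientation dataset: equal quarters of 0/90/180/270-degree
% rotations, with roughly half of the images flipped horizontally.
angles = [0 90 180 270];
files  = dir(fullfile('preprocessed', '*.jpg'));
for a = angles
    mkdir(fullfile('orientation_dataset', num2str(a)));
end
for k = 1:numel(files)
    img = imread(fullfile(files(k).folder, files(k).name));
    ang = angles(mod(k - 1, 4) + 1);                 % one quarter of the set per angle
    if rand < 0.5
        img = flip(img, 2);                          % horizontal flip for ~half of the images
    end
    img = imrotate(img, ang);                        % 90-degree steps preserve the matrix size
    imwrite(img, fullfile('orientation_dataset', num2str(ang), files(k).name));
end
```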
2.4. Dataset for the Angle Correction
As with the dataset for the orientation correction, a total of 20,000 images were prepared from the first examination of each patient in Chest-14. Half of the images were flipped horizontally, and all images were randomly rotated between −25 and 25 degrees by computational processing. The images were divided into 16,000 training images and 4000 test images.
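The random tilt used as the regression target could be generated as in the following sketch; the output folder and label file are assumptions.

```matlab
% Sketch of the angle-correction dataset: each image receives a uniformly
% random rotation in [-25, 25] degrees, which is stored as the regression target.
files = dir(fullfile('preprocessed', '*.jpg'));
n     = numel(files);
tilt  = -25 + 50 * rand(n, 1);                       % uniform in [-25, 25] degrees
mkdir('angle_dataset');
for k = 1:n
    img = imread(fullfile(files(k).folder, files(k).name));
    if rand < 0.5
        img = flip(img, 2);                          % horizontal flip for ~half of the images
    end
    img = imrotate(img, tilt(k), 'bicubic', 'crop'); % keep the 224 x 224 size
    imwrite(img, fullfile('angle_dataset', files(k).name));
end
writetable(table({files.name}', tilt, 'VariableNames', {'file', 'angle'}), ...
    'angle_dataset_labels.csv');
```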
2.5. Dataset for the Left–Right Reversal Correction
A total of 20,000 images were prepared from the first examination of each patient in Chest-14. Half of the images were flipped horizontally by computational processing, and the images were divided into 16,000 training images and 4000 test images.
Horizontal flipping of chest radiographs can occur due to incorrect parameter settings on the X-ray console. For example, if a chest radiograph is taken in the sitting or lying position using the console’s parameters for the standing position, the resulting image may be flipped horizontally.
2.6. Dataset for the Judgment of the Patient’s Position
Clinical images obtained at Otaru General Hospital were used for the dataset for the judgment of the patient's position. A total of 3000 images were prepared for each position: standing, sitting, and lying. For each position, 2600 images were used for training and 400 for testing, giving a total of 7800 training images and 1200 test images.
2.7. Dataset for Overall Test
The QA system was constructed by combining the models trained on the above datasets. To assess the overall quality of the QA system, a total of 120 clinical images, 40 for each position, were prepared. A random left-right inversion was applied to 30% of the images, and all images were randomly rotated between 0 and −359 degrees, so the dataset contained a mixture of images in various states. In pre-processing of the clinical images, the existing markers indicating the patient's position in the upper-left space of the CXR were masked with a black cover.
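A minimal sketch of these perturbations, under assumed folder names and an assumed mask size, is given below.

```matlab
% Sketch of the overall-test perturbations: mask the old position marker,
% flip ~30% of the images horizontally, and apply a random rotation of 0 to -359 degrees.
files = dir(fullfile('overall_test_src', '*.jpg'));
mkdir('overall_test');
for k = 1:numel(files)
    img = imread(fullfile(files(k).folder, files(k).name));
    img(1:30, 1:60, :) = 0;                          % black cover over the old upper-left marker (size assumed)
    if rand < 0.3
        img = flip(img, 2);                          % left-right inversion for ~30% of the images
    end
    img = imrotate(img, -randi([0 359]), 'bicubic', 'crop');
    imwrite(img, fullfile('overall_test', files(k).name));
end
```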
2.8. Training for Creating Models
The software for the DL methods was developed in-house in MATLAB (The MathWorks, Inc., Natick, MA, USA), and a desktop computer with an NVIDIA GeForce RTX 3080 graphics card (NVIDIA Corporation, Santa Clara, CA, USA) was used. Three CNNs available in MATLAB were used for image training: VGG-16 (Visual Geometry Group) [19], ResNet-50 (Residual Neural Network) [20], and an original CNN. The original CNN was a simple network constructed by reducing the convolution layers of VGG-16; its architecture is shown in Figure 2. The classification method was applied to all tasks except the angle correction, for which the regression method was applied. In the regression method, the final layer was replaced with a regression layer with an output size of 1, and a dropout layer was added in front of the regression layer. Adam was used as the optimizer.
The loss function was cross-entropy for the classification method and half-mean-squared error for the regression method. Fivefold cross-validation was performed for all training. The following hyperparameters were used: maximum number of training epochs, 30; initial learning rate, 0.0001; learning rate drop period, 5; and learning rate drop factor, 0.2. Early stopping was used with a validation patience of 10, and the batch size was 16. In the classification method, the model with the highest area under the curve (AUC) was selected as the best model; in the regression method, the model with the lowest mean absolute error (MAE) was selected.
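The training configuration above roughly corresponds to the following MATLAB sketch; the simplified layer stack stands in for the original CNN used in the regression task, and details such as the dropout rate, the table variables (trainTbl, valTbl), and the folder layout are assumptions rather than the exact implementation.

```matlab
% Sketch of the regression training setup (angle correction) with the stated hyperparameters.
layers = [
    imageInputLayer([224 224 1])
    convolution2dLayer(3, 32, 'Padding', 'same')
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)
    convolution2dLayer(3, 64, 'Padding', 'same')
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)
    dropoutLayer(0.5)                                % dropout in front of the output layers (rate assumed)
    fullyConnectedLayer(1)                           % single scalar output: the tilt angle
    regressionLayer];                                % half-mean-squared-error loss

opts = trainingOptions('adam', ...
    'MaxEpochs', 30, ...
    'InitialLearnRate', 1e-4, ...
    'LearnRateSchedule', 'piecewise', ...
    'LearnRateDropPeriod', 5, ...
    'LearnRateDropFactor', 0.2, ...
    'ValidationData', valTbl, ...                    % table of validation image paths and angles
    'ValidationPatience', 10, ...
    'MiniBatchSize', 16, ...
    'Shuffle', 'every-epoch');

% trainTbl: first variable holds image file paths, second the target angles.
net = trainNetwork(trainTbl, layers, opts);
```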
2.9. Overall Test
We performed an overall test to evaluate the accuracy of our QA system. The QA system corrected the orientation, left-right reversal, and angle, and then estimated the patient's position. Finally, a marker was added to the free space in the upper-left corner of the image according to the patient's position. The image correction was confirmed by visual evaluation by two radiological technologists with 20 and 5 years of experience, respectively. Precision, recall, and overall accuracy in estimating the patient's position were calculated, and the mean processing time of the QA system was measured. The dataset comprised the 120 clinical images described above.
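For reference, a minimal sketch of how per-position precision, recall, and overall accuracy can be computed from the true and predicted labels is shown below (variable names are placeholders).

```matlab
% Sketch of the evaluation metrics from true vs. predicted position labels.
C = confusionmat(trueLabels, predictedLabels);       % rows: true class, columns: predicted class
precision = diag(C) ./ sum(C, 1)';                   % per-class precision
recall    = diag(C) ./ sum(C, 2);                    % per-class recall
accuracy  = sum(diag(C)) / sum(C(:));                % overall accuracy
```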
2.10. Statistical Analysis
All continuous variables are shown as means ± standard deviations, regardless of the dataset. The Kruskal–Wallis test followed by the Steel–Dwass test was used to compare the performance of the CNNs. A p-value < 0.05 was considered statistically significant. R version 4.1.1 was used for all statistical analyses and figure creation.
4. Discussion
We developed a QA system for CXR imaging using DL methods. The orientation, angle, and left-right reversal of the CXR were corrected completely within approximately 0.4 s, and the patient's position (standing, sitting, or lying) was estimated with 96% accuracy.
Fonseca A et al. achieved 99.4% accuracy for the orientation correction of pediatric CXR using a machine learning method [14]. In contrast, we achieved 100% accuracy for the correction of orientation, angle, and left-right reversal of CXR, which suggests an advantage of using CNNs for CXR correction. Hržić F et al. reported highly accurate correction of radiographs of various body parts using CNNs, achieving 99.3% accuracy with VGG-16 and a processing time of 0.02 s on a GPU [15]. Meanwhile, we propose a dedicated QA system that achieved 100% geometric correction of CXR; this result suggests that it may be better to construct a QA system for each body part. Processing time will not be a major problem for routine clinical use, because less than 1 s is clearly faster than manual operation by technologists.
Overall, the main advantage of our study over the above studies is that we combined four simple models to construct the QA system. Moreover, we also estimated the patient's position, which is directly related to the imaging findings. We believe that our study proposes a novel, comprehensive QA system for CXR based on DL methods.
The constructed QA system combined classification and regression using CNNs. ResNet-50 demonstrated better performance for the classification tasks: orientation correction, left-right reversal correction, and estimation of the patient's position. We have previously confirmed the good compatibility of ResNet-50 with the classification of medical images, specifically magnetic resonance images [21], and the same trend was confirmed in this study.
In contrast, the original CNN, which had a simpler layer structure, showed significantly higher performance in the regression method used to correct the angle of the CXR images. We consider that the original CNN could avoid overfitting compared with the other CNNs in the angle-regression task, whereas ResNet-50 may simply be too deep for angle correction. The accuracy of the entire QA system can be assured by using the appropriate model for each task. In the detection of pneumonia on CXR using DL, Szeppi P et al. achieved extremely high accuracy (97%) using a modified VGG-16, which strongly suggests that VGG-16-like CNNs are compatible with CXR and can exhibit high performance [22]. They also showed the efficacy of dropout in CNNs, and our CNNs may be refined further according to their method.
The QA system should be built into the X-ray console system: the CXR would be corrected immediately and the marker for the patient's position assigned, after which radiological technologists would confirm the corrected CXR and send it to the picture archiving and communication system.
Recently, medical errors have become a serious concern in clinical practice. In the United States, medical errors claim more lives in hospitals than motor vehicle accidents, breast cancer, or AIDS [10]. According to the Swiss cheese model and Heinrich's law, many minor incidents lie hidden behind serious medical accidents; therefore, preventing minor incidents is directly related to preventing major accidents.
Under these circumstances, AI has been employed to develop automatic QA systems in the medical field. Claessens M et al. summarized the current status of AI-based QA systems in radiotherapy, where various AI technologies have been applied, including auto-segmentation, image registration, auto-planning, CT image generation, patient QA, and machine QA; they emphasized the importance of the correct use of case-specific and routine QA in clinical practice [23]. Chan MF et al. explored machine learning approaches, highlighting specific applications in machine and patient-specific QA, and introduced the clinical usefulness and limitations of machine learning in QA systems. They argued for the necessity of a sanity check and a second check before the clinical use of AI-based QA systems and suggested that the limitations of both the data and the ML models should be addressed [24]. Thus, we believe that total reliance on AI-based QA systems should be avoided, especially in the field of radiography, where AI-based QA systems are less developed than in radiotherapy. In this sense, our QA system for CXR will serve as a powerful tool for double-checking together with radiological technologists.
A large number of radiographs are taken in daily practice; therefore, a QA system for radiographs is sure to make a significant contribution to clinical work. To date, however, most AI studies on radiographs have applied the classification method to diagnosis and the regression method to the estimation of quantitative values. We considered that these techniques should also be applied to the construction of a QA system for radiographs, yet no study has reported an automatic QA system for CXR imaging that corrects the orientation and angle and estimates the patient's position. Hence, we attempted to develop a QA system for CXR, the most common radiograph in clinical practice.
One of the reasons CXR was selected is that a large dataset such as Chest-14, which contains over one hundred thousand CXR images, was available, and many clinical images were also stored on our institution's server. A large dataset is necessary to create a DL model with high accuracy. In addition, a CXR has essentially one pattern, with the chest in the center of the image, and the variation among images is extremely small compared with other body parts, such as the extremities, so we did not need to prepare images of many patterns. This is an advantage in the preparation of datasets and in model training. Therefore, CXR was suitable for a first attempt to construct an AI-based QA system.
We excluded repeat images of the same patients from Chest-14 to avoid training on similar images. Many images thereby became unusable, but this process was necessary to avoid overfitting to specific images. The dataset appears to have been of sufficient size for this task, because complete correction was achieved for orientation, angle, and left-right reversal. Chest-14 does not provide detailed information on the patient's position at the time of imaging, such as standing, sitting, or lying. Therefore, we used the clinical images of our institution as the dataset for training the estimation of the patient's position, even though this dataset was small compared with the others. As a result, we obtained 96% accuracy in estimating the patient's position, so we believe that the dataset size was sufficient for this study.
Meanwhile, a few images were confused between the sitting and lying positions in this system. This is because strict positioning management was not enforced; in addition, in both the sitting and lying positions, the X-rays are irradiated from the front side of the patient, so the CXR images in the two positions may become very similar. From the clinical perspective, the degree of body raising is adjusted according to the patient's respiratory status. We therefore consider that some blurring of the boundary between the sitting and lying positions is unavoidable.
On the other hand, the standing position was classified completely, meaning that the direction of X-ray irradiation was fully distinguished. The irradiation direction is directly related to CXR findings, such as the distribution of pleural effusions and air inclusions and the measurement of the CTR. Therefore, accurate estimation of the irradiation direction is an important factor in CXR management in clinical practice. Moreover, the orientation, angle, and left-right reversal of the images were fully corrected by the QA system, and the processing time is clearly faster than manual operation. These facts demonstrate the usefulness of the QA system.
This study has several limitations. First, it was performed at a single institution; further investigation is needed to evaluate the generalization performance of the QA system, and larger datasets including clinical images from other institutions should be used for model training. Moreover, we excluded images of children under 15 years from the datasets. This processing may introduce a statistical bias relative to the actual clinical setting, although it is essential to achieving a high-accuracy model for CXR in adult patients [12].
Second, all clinical images were prepared retrospectively. Therefore, a black mask is present in the upper-right corner of the image to hide the previous marker corresponding to the patient's position. This mask might act as a landmark for correcting the orientation, angle, and left-right reversal, although the correction models were created from Chest-14 data, which do not contain such a mask; further investigation using unmasked CXR images is warranted. Third, all evaluations were performed on a GPU machine. A portable X-ray console PC without a GPU may have a longer processing time, so a clinical trial using a routinely used X-ray console system should be attempted.
Fourth, this study aimed to construct an AI specialized for the QA of CXR. Hence, this system cannot fully replace human work, given the differences between AI models and human cognition [13]. Double-checking by both the QA system and the technologists should therefore still be enforced.