1. Introduction
Attention-deficit/hyperactivity disorder (ADHD) is a mixed behavioral disorder characterized by symptoms such as carelessness (inattention) and hyperactivity–impulsivity, and it has genetic, neurological, and psychosocial associations [1]. Carelessness leads to problems such as not paying attention, not following rules or instructions, and not completing tasks, whereas hyperactivity–impulsivity leads to problems such as constantly moving or speaking.
ADHD typically appears before the age of 7, causing behavioral issues and poor academic performance in school-age children, as well as long-term impairment of academic, social, and emotional functions. ADHD is common enough that one or two children in a class are reported to have it, but the reported prevalence varies widely, from 4% to 13%, depending on the expert making the diagnosis, the diagnostic criteria, the information sources, and the disability criteria. This variation arises because hyperactivity and impulsiveness are evaluated through the subjective observations of parents, teachers, and clinicians, so the evaluation differs with the observer and the environment [2]. Existing ADHD diagnosis relies on the subjective judgment of clinicians in evaluating children’s activity, and analysis based on objective behavioral indicators and data is difficult to generalize. In addition, because questionnaires completed by parents and teachers are used as the basic data for evaluating children’s disabilities, distortion in the questionnaire results limits objective evaluation. Children cannot complete a questionnaire directly, and a child who is examined in a hospital may become psychologically intimidated. As a result, the discrepancy between observers (parents, teachers, and clinicians) hinders the diagnosis of ADHD in children. Because children with ADHD lack social interaction skills, appropriate treatment, such as social skill training, should be provided urgently. However, the above limitations make the early diagnosis of ADHD in children difficult.
To overcome these limitations, studies have been conducted to quantitatively measure the behavioral characteristics of children with ADHD. From 2014 to 2021, the Ministry of Trade, Industry, and Energy of Korea supported the project “development of a screening and training robot system with 95 percent behavioral recognition based on HRI for children with developmental disorder” to solve the abovementioned problems. The “ADHD screening system in the form of play using robots” developed through the project is a robot-based ADHD screening tool for children. In this system, the robot first delivers instructions to the subject and shows the path of the game to be played. Once the robot’s demonstration is over, the child (subject) moves along the path shown by the robot and reacts to random stimuli. While the child plays the game, the child’s joint data are acquired through a multi-Kinect sensing system. Indicators corresponding to the categories of abnormal behavior were then extracted from the collected data, and finally, a classifier that screens for ADHD was developed using these indicators as input to a multilayer perceptron (MLP).
The results of this study are significant in verifying that the “robot-led ADHD screening game” is a useful tool for ADHD screening and a system that can be incorporated into schools to help with the early diagnosis of ADHD in children. It is also significant in that it can quantitatively measure the behavior of children with ADHD.
The contributions of this study are as follows, and its schematic is shown in Figure 1.
In this study, we propose a contactless sensor system to quantitatively measure children’s ADHD characteristics during a screening game using a robot and an ADHD classifier using the data acquired therefrom. The proposed sensor system uses multiple RGB-D sensors to acquire the child’s skeletal data and RGB-D data without occlusion while playing the game. Through this system, we were able to acquire the child’s body data without wearable sensors. Because of this, participants were able to move their bodies without the movement restrictions imposed by wearing the sensor;
In addition, we extracted the features of abnormal behavior predefined by clinicians from the acquired children’s data and developed an ADHD classifier using MLP;
Finally, we selected high-importance features among the predefined features. Clinicians then verified the justification of the selected features, and their feedback was applied to improve the feature extraction. Reflecting this clinician-verified feedback has the advantage of improving the developed classifier’s performance using the collective intelligence of clinicians and of preventing the classifier from being biased toward the opinion of a few clinicians.
The detailed structure of this paper is as follows. Section 2 introduces the existing methods for diagnosing ADHD. Section 3 introduces the materials used in this study and the methods proposed. Section 4 describes the experimental results. Section 5 presents the conclusion and discussion. Section 6 presents future works. Finally, Section 7 presents the limitations of this study.
2. Related Works
Various types of sensors have been used to diagnose ADHD. Among these, the most widely used is the inertial measurement unit (IMU). Alarifi and Alwadain attached accelerometers to the subject’s finger and wrist and attempted to find differences between the ADHD and normal groups [3]. Their study revealed significant differences in the sensor data obtained from the two groups. However, because the number of subjects was small, the results were difficult to generalize, and no quantitative result was reported for classifying the ADHD and normal groups. O’Mahony et al. [4] attempted to develop an objective scale that can help diagnose ADHD by characterizing the movements of the ADHD and non-ADHD groups and quantifying these characteristics. The authors used two IMUs, each consisting of a 3-axis accelerometer and a gyroscope, attached to the waist and the ankle of the dominant leg. The children’s movements were continuously measured during a visit to a psychiatrist. A total of 668 different features were extracted from the acquired data in the time and frequency domains. By applying a support vector machine (SVM) model to the extracted features, an accuracy of up to 95% was obtained. Munoz-Organero et al. [5] experimented on 16 children with ADHD and a control group. All children wore IMU sensors on their wrists and ankles for 24 h. After data collection, two-dimensional acceleration images were generated. With these images as input, children with ADHD and the control group were classified using a convolutional neural network (CNN), and the accuracy of ADHD screening was reported as 93.75%. These previous studies had a significant drawback: children had to wear sensors for 24 h. In other words, the reliability of the classification accuracy has a weak point because the attached sensors may make the subject’s movement unnatural. In addition, it is difficult to normalize the collected data because children’s life patterns differ.
Another research group conducted studies on ADHD diagnosis using a sensor called Actigraph, which measures not only acceleration but also the intensity of joint movements [6,7]. Wood et al. obtained an ADHD screening accuracy of 88% using the movement patterns of children’s wrists and ankles measured with Actigraph. Miyahara et al. measured wrist movements by having each child wear Actigraph on the wrist for 2 h every day for a week and obtained a classification accuracy of 70%. In another line of research, four 3-axis accelerometers were mounted on both wrists and both ankles of children to acquire data while the children were in school. The data obtained through these sensors were used to classify the children into each group using CNNs [8,9,10,11,12].
The studies described so far screened for ADHD on the basis of the movements of specific joints. However, the combined movement of multiple joints, rather than of specific joints alone, may affect the performance of ADHD screening. In addition, sensors that must be attached to the body interfere with this combined movement and create unnatural motion. Therefore, studies on ADHD screening have also been conducted on the basis of the motion data of the entire body obtained through motion capture. Li et al. measured the movement intensity of the entire body using Microsoft’s Kinect system [13]. The ADHD group and the control group consisted of 30 people each. The entire body movement of the children was measured using a depth sensor while they performed a simple Go/No-Go task. Using the measured data, the authors performed ADHD screening and achieved an accuracy of 90%.
Min et al. [14] conducted a study on ADHD screening using a robot, and the present study was conducted as a follow-up to that study. Lee et al. screened for ADHD by feeding the skeletal data of children, acquired while they played a game with a robot, into deep learning models [15,16]. In [15], the acquired skeletal data were used as input to a recurrent neural network. In [16], the effect of each stage of the ADHD screening game on screening performance was analyzed using the skeletal data of the participants.
As mentioned above, many studies have been conducted on ADHD screening using various sensors. However, most of these approaches are difficult for children to use continuously because a sensor must be worn for a long time during the experiment. Furthermore, approaches that use a depth camera instead of attached sensors to overcome this problem suffer from poor accuracy. For these reasons, a system that achieves high ADHD screening accuracy through non-contact measurement is required.
3. Materials and Methods
In this section, we introduce the sensor system used to collect data and the screening game using robots. In addition, we explain how to diagnose ADHD using data collected from children while playing a game led by a robot.
3.1. A Robot-Led Game for ADHD Screening
“A robot-led game for the screening of ADHD” is a system developed to achieve objective ADHD diagnosis by solving the problem of subjectivities of clinicians’ diagnoses.
In this system, the robot first delivers instructions to the subject and shows the path of the game to be played. Following the robot’s demonstration, the child moves along the path shown by the robot and reacts to random stimuli. There are two main types of stimuli: “one of the friends” and “witch”. When the “one of the friends” stimulus is presented, the child must raise a hand overhead and wave, and when the “witch” stimulus is presented, the child must squat down. The child’s movements and the type and result of each reaction are measured and recorded as features, and these data are used to screen the child for ADHD. Each player plays a total of five games, including the practice game. In each game, the robot moves along a random path among nine areas marked with the numbers 1–9 on the game board. While the robot demonstrates the path, the child has to memorize it and must then follow the memorized path. The stimuli described above are presented while the child moves along the path, and the child must wave or squat depending on the type of character (a friend or the witch) that appears. These stimuli measure the child’s concentration and attention: the child should focus both on the robot’s path and on the stimuli presented during the game. During the game, the child’s body information is collected using multiple Microsoft Kinect Azure sensors. The collected data are the child’s skeletal data and RGB image data.
3.2. Participants and Data Acquisition Environment
In this study, participants were recruited from grades 1–6 of an elementary school. The HA group in Table 1 refers to the group of the first site operated by Hanyang University Hospital, and the EA group refers to the group of the first site operated by Ewha Womans University. Subsequent group names were designated in alphabetical order according to the site operation plan, but some groups could not be operated and the operating order of others was changed because of COVID-19. Three sessions of data collection were conducted, and the sites covered in each session are shown in Table 1.
To classify each child’s ADHD symptoms and accompanying diseases, we checked whether the child had undergone a mental health medical examination within the last 6 months and whether the child had been diagnosed by a psychiatrist, and the medications taken according to the diagnosis were classified to confirm the accompanying diseases. Children who were not eligible were first screened out through the structured interview and the main caregiver’s questionnaire, and the research subjects were then classified. The children who participated in this study were also assessed with existing ADHD screening tools, such as the Child Behavior Checklist (CBCL), the Korean ADHD Rating Scale (K-ARS), and the comprehensive attention test.
Five Microsoft Kinect Azure sensors were used as the sensing system to collect children’s data during the game, and the game board was projected on the floor using a beam projector. The data acquisition environment is shown in Figure 2. In the proposed game environment, the skeletal data of the target subjects could be obtained without occlusion errors during the game. The projector and the sensing system were driven by a single personal computer (PC) with the following specifications: Intel i-10900 CPU, 64 GB RAM, 1 TB SSD, and 2 GTX 2080 Ti GPUs. The screening game is played in a space of 5 m × 3 m, and 1–3 children can participate in the game. The five Kinect Azure sensors were installed at a height of 2 m from the ground to acquire the joint data of children in this space.
3.3. Multi-Subject-Tracking System
Multiple RGB-D sensors were utilized to track multiple subjects in a large area. Kinect Azure is an RGB-D sensor developed by Microsoft. Using its Body Tracking SDK, human information can be tracked in the form of skeletal data consisting of 32 joints. The skeleton-tracking algorithm is based on deep learning and runs on a GPU to track the skeleton in real time. With this system, skeletal data were captured in real time and used as raw data for diagnosing ADHD. To track the skeletal data of each subject individually, a merging algorithm was used to merge the multiple skeletons of the same subject obtained by the different sensors. Furthermore, an identification algorithm was proposed to track a specific subject participating in the game. An overview of the multi-subject-tracking system is depicted in Figure 3.
3.3.1. Capture Merged Skeletal Data
First, the coordinate system of each sensor was calibrated to a global coordinate system. The system was installed at different sites for the construction of the dataset. The calibration was performed according to the process presented in [17]. The centroid trajectory of a sphere object was recorded using point cloud data (PCD) and the RANSAC algorithm. In this process, the sphere object was first filtered by its color information. However, noise interfered with the estimation of the sphere object’s centroid in different environments. To remove this noise, a masking process that filters out predefined noise and a region-limited filter that removes PCD outside a limited region were adopted. Following filtering, a trajectory consisting of the sphere object’s centroid positions was constructed for each sensor. A transformation matrix consisting of rotation and translation information was then calculated by singular value decomposition, and the coordinate systems of all subordinate sensors were calibrated to that of the master sensor using the calculated transformation matrices. Lastly, all coordinate systems were calibrated to a user-defined global coordinate system using the ArUco marker corner positions recognized by a marker detection algorithm.
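The transformation step can be illustrated with a short example. The following is a minimal sketch of the standard SVD-based rigid alignment (often called the Kabsch method), assuming that two sensors have recorded time-synchronized centroid trajectories of the sphere. The function name `estimate_rigid_transform` and the synthetic trajectories are hypothetical and are not taken from [17].

```python
import numpy as np


def estimate_rigid_transform(src, dst):
    """Estimate R (3x3) and t (3,) so that dst ≈ src @ R.T + t.

    src, dst: (N, 3) arrays of matched centroid positions, one row per frame.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_mean = src.mean(axis=0)
    dst_mean = dst.mean(axis=0)

    # Cross-covariance of the centred trajectories.
    H = (src - src_mean).T @ (dst - dst_mean)
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t


# Synthetic check (assumed data): a subordinate sensor rotated 30 degrees
# about the z axis and shifted relative to the master sensor.
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.5, -1.0, 2.0])
rng = np.random.default_rng(0)
trajectory_sub = rng.uniform(-1.0, 1.0, size=(20, 3))    # subordinate frame
trajectory_master = trajectory_sub @ R_true.T + t_true   # master frame

R_est, t_est = estimate_rigid_transform(trajectory_sub, trajectory_master)
```

With noise-free synthetic trajectories, `R_est` and `t_est` recover the assumed rotation and translation. With real centroid estimates the result is a least-squares fit, which is why the noise filtering described above matters.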
Following calibration, skeletal data could be captured in a common coordinate system. The multiple skeletons tracked by the sensors in real time were then merged by the algorithm proposed in our previous study [17]. First, misoriented joints were corrected using reference positions, which were joints with high confidence values provided by the skeleton-tracking SDK. The misoriented joints were rearranged by a simple distance comparison procedure. Then, the merging algorithm merged the multiple joint positions accurately by filtering out noisy candidates using density-based spatial clustering of applications with noise (DBSCAN). The reference position, which is the average of the candidates with high confidence values and the previous position of the corresponding joint, was used during clustering to give high-confidence candidates more weight in the merging process. Lastly, a Kalman filter was used to smooth the jitter in the joints’ movement. With this algorithm, the position error was lower than 52 mm for all joints when four sensors were used. Using the merging algorithm, accurate skeletal data of each subject could be captured in real time.
3.3.2. Human Identification
The purpose of this study was to screen each participant for ADHD. Therefore, after the skeletal data of multiple subjects were captured using the merging algorithm, the skeletal data of each participant were identified so that each participant could be tracked separately. The identification process proposed in this study is based on the color information of the clothing worn by each participant. To facilitate real-time tracking, a registration procedure was added in which the subjects participating in the content are registered in advance to create reference ID information based on color. In the registration procedure, which is carried out by the operator, the subject is instructed to perform a T-pose at a predefined location. The operator can also predetermine the number of subjects participating in the content. First, the color image captured by each sensor is converted into the HSV (hue, saturation, value) domain to store the color information of each subject. Then, a histogram of the accumulated hue-channel values of the pixels corresponding to the human, taken from the HSV images of all sensors, is calculated and stored as reference data for the ID of that subject. Pixels corresponding to humans are extracted from the original color image using the body-index image provided by the Body Tracking SDK, as shown in Figure 4. The same procedure for extracting human hue values is also used in the real-time tracking process.
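The hue-histogram step can be sketched as follows. This is a minimal illustration, assuming the color image and the body-index image are already aligned pixel by pixel and are given as nested Python lists. The function name `hue_histogram`, the number of bins, and the tiny example image are hypothetical choices, not values reported in this study.

```python
import colorsys


def hue_histogram(rgb_image, body_index, body_id, bins=18):
    """Accumulate a hue histogram over the pixels labelled with body_id.

    rgb_image:  rows of (r, g, b) tuples with values in 0..255.
    body_index: rows of integer labels, same shape as rgb_image.
    """
    hist = [0] * bins
    for rgb_row, index_row in zip(rgb_image, body_index):
        for (r, g, b), label in zip(rgb_row, index_row):
            if label != body_id:
                continue  # skip background and other people
            hue, _sat, _val = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            hist[min(int(hue * bins), bins - 1)] += 1
    return hist


# Assumed 2 x 2 example: body 0 wears red, body 1 wears blue,
# and label 255 marks a background pixel.
image = [[(255, 0, 0), (250, 10, 10)],
         [(0, 0, 255), (0, 255, 0)]]
labels = [[0, 0],
          [1, 255]]

hist_body0 = hue_histogram(image, labels, body_id=0)
hist_body1 = hue_histogram(image, labels, body_id=1)
```

Histograms from several sensors can be accumulated by adding them bin by bin, which corresponds to the accumulation of hue values described above.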
In the tracking process, multiple pieces of tracking information, each consisting of hue-histogram information and skeletal data, were captured in real time. The amount of tracking information depends on the number of sensors and subjects. For example, 10 skeleton candidates can be extracted when 5 sensors are used to track two subjects and every sensor captures both subjects. The skeleton candidates were first clustered on the basis of the position of the pelvis joint. The clustering method was the Hungarian-bipolar matching (HBM) algorithm proposed in [18]. The HBM algorithm assigns source classes to target classes using minimum cost matching. Furthermore, the HBM algorithm runs in polynomial time ($O(n^3)$ for the standard Hungarian algorithm), so the proposed identification procedure could be run in real time on the CPU. The cost matrix of HBM was calculated from the Euclidean distance between the pelvis positions of the skeleton candidates. In the positional clustering process, the number of skeletons recognized may differ from sensor to sensor; in other words, the number of subjects within the viewing area of each sensor can be different. However, the HBM algorithm can only assign sources to targets when the number of sources and targets is equal. Therefore, dummy classes are added to make the number of source and target classes equal. A dummy class has a large position value (for example, 1000, 1000, 100 for the x, y, and z coordinates). After minimum cost matching is performed using HBM, any source or target class assigned to a dummy class is added as a new class for the next matching.
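The padding with dummy classes can be sketched as follows. This is an illustration only: for brevity it finds the minimum-cost assignment by exhaustive search over permutations instead of the Hungarian algorithm, which gives the same optimum and is feasible for the few subjects considered here. The function name `match_pelvis_positions` and the example coordinates are hypothetical.

```python
import itertools
import math

DUMMY_POSITION = (1000.0, 1000.0, 100.0)  # example dummy value quoted in the text


def match_pelvis_positions(sources, targets):
    """Minimum-cost one-to-one matching of pelvis positions.

    Returns (pairs, new_sources, new_targets): matched index pairs, and the
    indices of sources/targets that were assigned to a dummy class.
    """
    n = max(len(sources), len(targets))
    # Pad the shorter list with dummy positions so both sides have n classes.
    src = list(sources) + [DUMMY_POSITION] * (n - len(sources))
    tgt = list(targets) + [DUMMY_POSITION] * (n - len(targets))
    cost = [[math.dist(s, t) for t in tgt] for s in src]

    # Exhaustive search (stand-in for the Hungarian algorithm).
    best = min(itertools.permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))

    pairs, new_sources, new_targets = [], [], []
    for i, j in enumerate(best):
        if i < len(sources) and j < len(targets):
            pairs.append((i, j))
        elif i < len(sources):
            new_sources.append(i)   # real source matched to a dummy target
        else:
            new_targets.append(j)   # real target matched to a dummy source
    return pairs, new_sources, new_targets


# One sensor sees two subjects, another sees only the second one (assumed data).
result_fewer_sources = match_pelvis_positions(
    sources=[(2.05, 0.0, 1.0)],
    targets=[(0.0, 0.0, 1.0), (2.0, 0.0, 1.0)])

result_fewer_targets = match_pelvis_positions(
    sources=[(0.0, 0.0, 1.0), (3.0, 1.0, 1.0)],
    targets=[(0.1, 0.0, 1.0)])
```

In the first call the single source is paired with the nearby second target and the first target is left for a dummy; in the second call the far-away source is the one assigned to the dummy and would become a new class.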
Following clustering based on the skeleton position, there can be multiple clusters, each consisting of multiple skeleton candidates, according to the number of subjects. The color information from each sensor was accumulated before the skeleton merging and identification process. For each cluster, a histogram was calculated from the accumulated hue values, which yields a single piece of tracking information consisting of multiple skeleton candidates, accumulated hue values, and a single histogram. The skeleton candidates are used to generate merged skeletal data, and the histogram is used to identify the ID of the subject. The histograms of the clustered tracking information are compared with the reference ID information of the participants determined during the registration procedure, as shown in Figure 5.
As in the position clustering method, the HBM algorithm was used as the matching method. The source group is composed of the histograms captured in real time, and the target group consists of the histograms of the reference ID information. The cost matrix is calculated on the basis of the Bhattacharyya distance (BD) [19]. The BD is widely used in signal processing, image processing, and pattern recognition research as an index for measuring the distance between two probability distributions. The BD is also used to measure the difference (or similarity) between two histograms $H_1$ and $H_2$ and can be defined as in Equation (1).

$$d(H_1, H_2) = \sqrt{1 - \frac{1}{\sqrt{\bar{H}_1 \bar{H}_2 N^2}} \sum_{I} \sqrt{H_1(I)\, H_2(I)}} \qquad (1)$$

where $H_1$ denotes the histogram of the first probability distribution, $H_2$ denotes the histogram of the second probability distribution, $N$ represents the number of histogram bins, and $\bar{H}_k = \frac{1}{N} \sum_{J} H_k(J)$. Each histogram is divided by its total size so that its values sum to 1, which converts it into a probability distribution. The BD therefore compares the two histograms as probability distributions, unaffected by their overall scale, and the similarity of the histograms is determined from the distance between these distributions.
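As a worked illustration, the sketch below computes a Bhattacharyya distance between two hue histograms in the normalized form $d = \sqrt{1 - \sum_I \sqrt{H_1(I) H_2(I)} / \sqrt{\bar{H}_1 \bar{H}_2 N^2}}$, which is the form commonly used in image-processing libraries and is assumed here. The function name `bhattacharyya_distance`, the handling of empty histograms, and the example histograms are hypothetical.

```python
import math


def bhattacharyya_distance(h1, h2):
    """Bhattacharyya distance between two histograms with the same bins.

    Returns a value in [0, 1]: 0 for identical shapes, 1 for no overlap.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    n = len(h1)
    mean1 = sum(h1) / n
    mean2 = sum(h2) / n
    if mean1 == 0 or mean2 == 0:
        return 1.0  # assumption: an empty histogram matches nothing
    overlap = sum(math.sqrt(a * b) for a, b in zip(h1, h2))
    coefficient = overlap / math.sqrt(mean1 * mean2 * n * n)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.0, 1.0 - coefficient))


red_jacket = [90, 10, 0, 0, 0, 0]       # assumed hue histograms
red_jacket_far = [45, 5, 0, 0, 0, 0]    # same colors, fewer pixels
blue_jacket = [0, 0, 0, 0, 80, 20]

d_same = bhattacharyya_distance(red_jacket, red_jacket_far)
d_diff = bhattacharyya_distance(red_jacket, blue_jacket)
```

Because of the normalization, the same jacket seen with fewer pixels still gives a distance close to 0, whereas histograms with no common bins give a distance of 1.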
Several factors could disturb data collection in various environments. For example, participants could move out of the content area, unregistered people could enter it, or registered subjects might not perform the content. Participants moving out was not a severe issue because the tracking system could re-track a subject who re-entered the content area. Similarly, dropped frames, that is, frames in which one or more subjects were not tracked, were not severe. However, the other cases could impair the data. For instance, if unregistered subjects are not filtered out, the data may include people other than the target subjects, and the result of ADHD screening is significantly affected. Furthermore, the HBM algorithm can only operate when the number of sources and targets is equal. To address these issues and make data collection robust, several measures were added to the identification process. First, unregistered subjects are filtered out with a threshold on the BD in the color information matching process. The threshold was set to 0.5 in every data collection in this study. Second, when an unregistered subject wearing clothing of a similar color to that of a registered subject enters the content area, the number of target classes is increased to equal the number of subjects in the content area. In other words, dummy data equal in number to the difference between the subjects in the content area and the targets are appended to the target classes. The cost between each source and each dummy target class is set to the largest number, indicating the highest cost. After the cost matrix is constructed in this way, non-targets (unregistered subjects) can be filtered out by excluding the sources assigned to dummy target classes. Third, when subjects do not perform the content, meaning that the number of source classes is smaller than the number of targets, the source classes are extended with dummies to equal the number of targets, in the same way as the target classes. The target classes assigned to dummy source classes can then be filtered out as non-player subjects. Figure 6 shows an example of identifying three subjects and a robot using the explained methods.
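The three measures can be combined in a short sketch. It assumes the BD values between live and registered histograms are already available as a cost matrix, and it again uses exhaustive search as a stand-in for the Hungarian algorithm. The function name `identify`, the value of `BIG_COST`, and the example cost matrices are hypothetical; only the threshold of 0.5 comes from the text.

```python
import itertools

BIG_COST = 1e6       # cost of pairing with a dummy class (assumed value)
BD_THRESHOLD = 0.5   # threshold reported in the text


def identify(cost, n_targets):
    """Assign live histograms (sources) to registered IDs (targets).

    cost[i][j] is the BD between live histogram i and registered ID j.
    Returns (matches, unregistered_sources, absent_targets).
    """
    n_sources = len(cost)
    n = max(n_sources, n_targets)
    # Pad with dummy rows or columns so that the matrix is square.
    padded = [[cost[i][j] if i < n_sources and j < n_targets else BIG_COST
               for j in range(n)] for i in range(n)]
    best = min(itertools.permutations(range(n)),
               key=lambda p: sum(padded[i][p[i]] for i in range(n)))

    matches, unregistered = {}, []
    for i in range(n_sources):
        j = best[i]
        if j < n_targets and cost[i][j] <= BD_THRESHOLD:
            matches[i] = j
        else:
            unregistered.append(i)  # dummy target or BD above the threshold
    absent = [j for j in range(n_targets) if j not in matches.values()]
    return matches, unregistered, absent


# Three registered IDs, two people in view; the second matches no ID well.
case_missing_player = identify([[0.10, 0.70, 0.80],
                                [0.65, 0.60, 0.70]], n_targets=3)

# One registered ID, two similarly dressed people in view.
case_extra_person = identify([[0.20], [0.15]], n_targets=1)
```

In the first case only source 0 is accepted and targets 1 and 2 are reported as non-players; in the second case the closer source keeps the ID and the other is filtered out as unregistered.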
For robust data collection, every subject was instructed to wear a jacket of a specific color (red, green, or blue). The BD between every pair of jacket colors was greater than 0.5: the distance was 0.703 between red and blue, 0.603 between red and green, and 0.638 between green and blue. This requirement prevents the mismatched identification that occurs when subjects wear clothing of similar colors. In other words, this constraint makes the proposed identification method more reliable and robust.
3.4. Feature Extraction for Predefined Measurement Items and Diagnosis of ADHD
To confirm the degree of ADHD symptoms, a structured interview, the Mini International Neuropsychiatric Interview (M.I.N.I.), was conducted to determine the overall classification of the child’s psychiatric disorder, the subject’s personal information, and the developmental history of symptoms and behaviors. In addition, the K-ARS (Korean version of the ADHD Rating Scale-IV) and the Korean version of the CBCL (Child Behavior Checklist) surveys were conducted to identify overall abnormal behaviors related to ADHD symptoms in children. As an individual test of the children, the comprehensive attention test (CAT; a Korean version of the continuous performance test) was conducted to evaluate the child’s attention on continuously given tasks, and the overall IQ was confirmed. To objectively index the ADHD symptoms of the DSM-5 and the children’s abnormal behaviors observed during clinical application, two clinical experts in mental health and psychotherapy defined the abnormal behaviors that can screen for ADHD symptoms. The finally selected abnormal behavior indicators were compared with the results of the ADHD-related questionnaires (K-ARS, CBCL) completed by the parents of the participating children, and the discriminative power of the screening classifier was confirmed. Two specialists who had completed master’s and doctoral degrees in psychiatry at a medical school and had more than 15 years of experience participated in checking the scale. There are a total of 15 abnormal behavior indicators in 4 categories that can measure the attention, activity level, and impulsivity of children playing the screening game with the robot. These 4 categories were based on previous studies on ADHD screening [14,20]. The authors of these studies tried to objectively measure the movements of the participants through the robot-assisted kinematic measure for ADHD (RAKMA). In these studies, 35 ADHD and 50 normal subjects participated in the experiment from 2016 to 2017, and the at-risk ADHD group was not considered. Body movements, stimulus-response performance, step accuracy on the game board, reaction time, and movement distance and time were measured to assess inattention, impulsivity with motor activity, and hyperactivity. Table 2 shows the abnormal behavior indicators developed for the ADHD screening of children. The indicators of the previous studies were designed on the basis of the DSM-5, and this study inherited this design and upgraded the indicators for ADHD screening (Indicators A2, A3, B1, B2, B3, and B5 in Table 2).
As shown in Table 2, 15 abnormal behavior indicators are measured for each game level, so the number of features measured over the 5 levels of a game is 75. The “wait” state, which is the basis for extracting each indicator, is the state in which the robot is explaining the game, and the “on game” state is the state in which the robot has finished the explanation and the child has to play the game.
3.4.1. Indicator A1—Body Movement during Wait
The first abnormal behavior to detect is “body movement during wait”. The child is asked to pay attention while the robot is demonstrating. However, children with ADHD often have poor concentration and are typically less able to stay focused than normal children. Therefore, this indicator reflects a child’s inattention or hyperactive–impulsive behavior. The measurement is performed in the “wait” state of the child. The movement to be recorded includes not only body joints, such as the arms and shoulders, but also the movement of the gaze. During wait, body movements are measured in the world coordinate system, and head, shoulder, and wrist movements are measured by Equations (2) to (4), respectively. The total measurement time is divided into sections of 5 s; a section in which motion occurs is defined as 1, and a section without motion is defined as 0. The A1 indicator is then extracted as the ratio of the sections in which motion occurs to all sections, as expressed in Equation (5). The threshold of Equation (2) is 30 degrees, and the threshold of Equations (3) and (4) is 200 mm.
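The section-based ratio can be sketched as follows. This is only an illustration under stated assumptions: motion within a section is judged against the pose in the first frame of that section, the head movement is reduced to a single angle in degrees, and each frame is a dictionary with the hypothetical keys `t`, `head_deg`, `shoulder`, and `wrist`. The 5 s section length and the 30 degree and 200 mm thresholds are the values given in the text.

```python
import math

SECTION_SECONDS = 5.0
HEAD_THRESHOLD_DEG = 30.0
JOINT_THRESHOLD_MM = 200.0


def section_has_motion(frames):
    """Return True if any frame deviates from the first frame of the section."""
    ref = frames[0]
    for frame in frames[1:]:
        if abs(frame["head_deg"] - ref["head_deg"]) > HEAD_THRESHOLD_DEG:
            return True
        if math.dist(frame["shoulder"], ref["shoulder"]) > JOINT_THRESHOLD_MM:
            return True
        if math.dist(frame["wrist"], ref["wrist"]) > JOINT_THRESHOLD_MM:
            return True
    return False


def indicator_a1(frames):
    """Ratio of 5 s sections with motion to all sections in the wait state."""
    if not frames:
        return 0.0
    t0 = frames[0]["t"]
    sections = {}
    for frame in frames:
        index = int((frame["t"] - t0) // SECTION_SECONDS)
        sections.setdefault(index, []).append(frame)
    flags = [1 if section_has_motion(s) else 0 for s in sections.values()]
    return sum(flags) / len(flags)


# Ten seconds at 1 Hz (assumed data): still for 5 s, then one large wrist movement.
wait_frames = [
    {"t": float(t), "head_deg": 0.0, "shoulder": (0.0, 0.0, 0.0),
     "wrist": (300.0, 0.0, 0.0) if t == 7 else (0.0, 0.0, 0.0)}
    for t in range(10)
]
a1_value = indicator_a1(wait_frames)
```

Here one of the two sections contains motion, so the ratio is 0.5.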
3.4.2. Indicator A2—Take a Seat during Wait
The second abnormal behavior is “take a seat during wait”. This indicator measures the number of times a child sits down during wait and is extracted using Equation (6).
3.4.3. Indicator A3—Leave the Waiting Area during Wait
The third abnormal behavior is “leave the waiting area during wait”. The child is asked to stay in the waiting area while the robot demonstrates. Like A1, this indicator is measured in the “wait” state of the child; the total number of times the child leaves the waiting area is counted using Equation (7).
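Counting how often the child leaves the waiting area amounts to counting inside-to-outside transitions. The sketch below assumes that the waiting area is an axis-aligned rectangle on the floor and that the child’s floor position (for example, the pelvis projected onto the floor) is available for each frame. The function name `count_exits` and the coordinates are hypothetical.

```python
def count_exits(positions, area):
    """Count transitions from inside to outside of a rectangular area.

    positions: sequence of (x, y) floor positions, one per frame.
    area:      (xmin, xmax, ymin, ymax) of the waiting area.
    """
    xmin, xmax, ymin, ymax = area

    def inside(p):
        return xmin <= p[0] <= xmax and ymin <= p[1] <= ymax

    exits = 0
    was_inside = None  # unknown before the first frame
    for p in positions:
        now_inside = inside(p)
        if was_inside and not now_inside:
            exits += 1
        was_inside = now_inside
    return exits


waiting_area = (0.0, 1.0, 0.0, 1.0)  # assumed 1 m x 1 m waiting area
track = [(0.5, 0.5), (0.6, 0.5), (1.4, 0.5), (0.8, 0.5), (0.9, 1.3), (0.9, 1.5)]
a3_value = count_exits(track, waiting_area)
```

The example track leaves the area twice, so `a3_value` is 2. In practice a small margin or a minimum duration outside the area may be needed to avoid counting jitter at the boundary.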
3.4.4. Indicator B1—Enter the Game Board before the Start Instruction
The fourth abnormal behavior indicator is “enter the game board before the start instruction”. The child should not enter the game board until the robot gives the instruction to start the game. However, impatient children occasionally enter the game board before the robot tells them to start. This indicator counts the number of times the child enters the game board before receiving the start instruction from the robot. Even if the child moves from the game board to the waiting area and then re-enters the game board, the measuring count is increased only once.
3.4.5. Indicator B2—Not Playing the Game after the Start Instruction
The fifth abnormal behavior to be measured is “not playing the game after the start instruction”. When the child receives the robot’s instruction to start the game, the child must enter the game board and move along the path. However, some children do not play the game even after the robot’s instruction to start the game. In this case, the indicator of “not playing the game after the start instruction” is measured once.
3.4.6. Indicator B3—The Amount of Time from the Instruction to Start until the Child Begins the Game
The sixth indicator of behavior to be measured is “the amount of time from the instruction to start until the child begins the game”. It is the time it takes for a child to enter the game board for the first time after receiving an instruction to start the game.
3.4.7. Indicator B4—The Accuracy of the Child’s Path Movement
The seventh indicator is “the accuracy of the child’s path movement”. This factor is appropriate for evaluating working memory deficit, one of the characteristics of ADHD. To measure it, each step taken by the child is recorded as true/false relative to the path presented by the robot, and the success/failure of the entire path is also recorded. Indicator B4 has four subclasses, B4_1 to B4_4. B4_1 and B4_3 indicate whether the child followed the suggested path at each step: if the path suggested by the robot and the path traveled by the child are the same, B4_1 is measured; if they differ, B4_3 is measured. The details of B4_1 and B4_3 are shown in
Figure 7.
Each step in the child’s path is finalized when both feet are located on a number-marked area of the game board. When a foot travels along a different path and then returns, this is defined as a path correction. If the child steps on an incorrect path and then returns before moving along the correct path, the step is measured as B4_2. The details of B4_2 are presented in
Figure 8.
Similarly, when the child’s foot moves along the correct path before moving along the incorrect path and then returns, it is measured as B4_4. The details of B4_4 are presented in
Figure 9.
The subclasses of B4 are measured as the ratio of each subclass to the total path, as expressed in Equation (8).
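The four subclasses and their ratios can be sketched as follows. Equation (8) is not reproduced here, so the ratio is assumed to be the number of steps in a subclass divided by the total number of steps. The step representation (a list of touched areas whose last entry is where both feet ended up) and the names `classify_step` and `b4_ratios` are hypothetical, and the labeling reflects one reading of the definitions above.

```python
from collections import Counter

def classify_step(expected, touched):
    """Label one step of the path.

    expected: the number-marked area required by the robot's path.
    touched: areas a foot touched during the step, in order; the last entry
             is the area where both feet ended up (assumed representation).
    """
    if touched[-1] == expected:
        # Ended correctly; any wrong area touched on the way is a correction.
        return "B4_1" if all(a == expected for a in touched) else "B4_2"
    # Ended incorrectly; B4_4 if the correct area was touched beforehand.
    return "B4_4" if expected in touched else "B4_3"

def b4_ratios(expected_path, touched_per_step):
    """Assumed form of Equation (8): subclass count / total number of steps."""
    counts = Counter(
        classify_step(e, t) for e, t in zip(expected_path, touched_per_step)
    )
    total = len(expected_path) or 1  # avoid division by zero on an empty path
    return {k: counts.get(k, 0) / total for k in ("B4_1", "B4_2", "B4_3", "B4_4")}
```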
3.4.8. Indicator B5—Move before the Movement Confirmation Sound
As the child moves through each number-marked area on the game board, a movement completion sound is produced. The child must not move on until this sound has completely stopped; a violation is measured as the B5 indicator.
3.4.9. Indicator C1 to C4—Child’s Response to Stimuli
Indicators C1–C4 measure the “child’s response to stimuli”. While the child is moving along the path on the game board, the robot presents a stimulus to the child. As mentioned before, the child should exhibit the appropriate reaction for each type of stimulus and should respond within 5 s after the stimulus is presented. The response is classified as true/false; if the child reacts after more than 5 s, the result is stored as “unresponsive–false”. The response types are recorded in the same way as the subclasses of B4: C1–C4 represent “correct response”, “incorrect response before correct response”, “incorrect response”, and “correct response before incorrect response”, respectively. The children’s responses were measured using Equations (9) and (10).
3.4.10. Indicator C5—Time Taken to Respond to a Stimulus
The indicator C5 is “time taken to respond to a stimulus”, that is, the time it takes for a child to react to a stimulus presented by the robot. It is measured in units of 1 ms each time a stimulus is presented.
3.4.11. Indicators D1 and D2—Total Execution Time and Distance Traveled
The indicators D1 and D2 are the total execution time and total distance while the child played the game, respectively. For these indicators, the execution time and travel distance for the “wait” and “in-game” states are acquired for each level, and the sum of them is also calculated and stored as a factor.
3.5. A Feature Selection Method for Measuring the Rank of Features
In this study, a feature selection method was used to rank the features that are helpful in the ADHD screening of children. The wrapper method is a feature selection method built on models that provide feature importance, such as tree-based and linear models. Although this method is time-consuming, it exhibits comparable performance to the linear model. Because this study aims to understand the importance of each feature, the wrapper method was adopted as the feature selection algorithm. It can overcome the disadvantage of the filter method, which does not consider the relationship between variables. In detail, subsets of feature combinations are first created. A linear model is then trained on each subset, and its performance is recorded. The importance of each feature is then decided from the training performance of the subsets that include it. To implement the wrapper method, recursive feature elimination (RFE) was adopted, with decision tree and linear regression models used as the selection algorithms. A schematic of the wrapper method is shown in
Figure 10.
When all features are set as the input data of RFE and the expert’s ADHD diagnosis result is set as label data, the ranking of the features for fitting the expert’s defined ADHD diagnosis result is extracted, as shown in
Figure 11.
Through this method, the importance of features could be ranked, and the result of feature selection is described in
Table 3. The features with the greatest influence on ADHD screening were the “attitude” factor, total performance time, and total travel distance. This suggests that the motor symptoms of children with ADHD are more important than other factors for ADHD screening.
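The elimination loop of RFE can be sketched in plain Python. This is an illustration under stated assumptions, not the implementation used in the study: a least-squares linear model fitted by gradient descent stands in for the selection algorithm, features are assumed to be standardized, and the feature with the smallest absolute coefficient is removed in each round. The names `fit_linear` and `rfe_ranking` and the default learning rate and epoch count are hypothetical.

```python
def fit_linear(X, y, lr=0.1, epochs=500):
    """Least-squares linear fit by gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        # Residuals of the current model on every sample.
        err = [sum(wj * xj for wj, xj in zip(w, row)) + b - t
               for row, t in zip(X, y)]
        for j in range(d):
            w[j] -= lr * sum(e * row[j] for e, row in zip(err, X)) / n
        b -= lr * sum(err) / n
    return w, b

def rfe_ranking(X, y):
    """Rank features by recursive elimination (rank 1 = eliminated last)."""
    remaining = list(range(len(X[0])))
    ranking = [0] * len(remaining)
    while remaining:
        sub = [[row[j] for j in remaining] for row in X]
        w, _ = fit_linear(sub, y)  # refit on the surviving features
        worst = min(range(len(remaining)), key=lambda k: abs(w[k]))
        ranking[remaining[worst]] = len(remaining)
        del remaining[worst]
    return ranking
```

Given a feature matrix `X` (one row per child) and the expert labels `y`, `rfe_ranking(X, y)` returns one rank per column, with 1 marking the feature that survived longest.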
3.6. Feedback Reflection Architecture for Reflecting Collective Intelligence by Clinicians
This procedure requires that a certain number of clinicians verify whether the result of each measured indicator is correct and that their feedback be reflected. This step aims to avoid biasing the results toward a few clinicians and thereby generalizes the ADHD classifier. The feedback reflection structure of clinicians is shown in
Figure 12, and the details are as follows.
Acquiring video data while playing a game;
Plotting the measured abnormal behavior features in the video;
Voting true/false by clinicians for the measured abnormal behavior;
Reflecting the clinician’s voting results in the feature extraction algorithm.
Among the highest-ranked features, all except A1 can be measured objectively. The “body movement during wait” indicator, A1, however, depends on the threshold setting on which the measurement is based. Therefore, the clinicians who designed this robot game and the indicators of abnormal behavior verified the output of the measurement algorithm. The clinicians referred to the video data and indicated in a separate document whether the A1 indicator was properly measured for each section. The threshold was then modified by comparing the total amount of movement of the participants measured in each section with the clinicians’ diagnosis. Four clinicians participated in this process, and the verification was performed using the labeling tables and videos, as described in the third step in
Figure 12.
Three pieces of feedback were obtained from the video-labeling review by the expert who designed the game. The details are shown in
Table 4 and
Figure 13.
In addition, the indicator extraction algorithm was modified to reflect each piece of feedback, as shown in
Table 5.
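One way the A1 threshold update could work is to choose, from a set of candidate thresholds, the one whose motion/no-motion decisions agree most often with the clinicians’ majority vote per section. This selection rule is an assumption; the text states only that the threshold is modified by comparing the measured movement with the clinicians’ assessment. The sketch also assumes that the votes have already been converted into per-section motion/no-motion labels, and the names `majority` and `tune_threshold` are hypothetical.

```python
def majority(votes):
    """True when more than half of the clinicians marked the section as motion."""
    return sum(votes) * 2 > len(votes)

def tune_threshold(movement_mm, votes_per_section, candidates):
    """Return the candidate threshold that best matches the majority votes.

    movement_mm: measured movement per 5 s section (hypothetical summary value).
    votes_per_section: for each section, a list of True/False clinician votes.
    candidates: thresholds to try; the first best one is returned on ties.
    """
    labels = [majority(v) for v in votes_per_section]

    def agreement(th):
        # Number of sections where the thresholded decision equals the vote.
        return sum((m > th) == lab for m, lab in zip(movement_mm, labels))

    return max(candidates, key=agreement)
```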
3.7. Deep Neural Networks Model for Performance Validation
The MLP was used as the performance verification model for ADHD screening with the features extracted in the previous process. The MLP consists of sequentially stacked perceptron layers, and the neurons in adjacent layers are fully connected. The cross-entropy loss was adopted as the loss function. Experiments on three research questions were conducted using the MLP model, and the results are presented in
Section 4.
Figure 14 shows a schematic of the MLP used in this study. The hyperparameters of the model were optimized using Keras Tuner; the resulting MLP consists of two hidden layers (256 units), with a dropout ratio of 0.4 in the training phase.
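The structure described above can be sketched as a forward pass in dependency-free Python (the study itself used Keras). Only the two hidden layers of 256 units, the dropout ratio of 0.4 during training, and the cross-entropy loss come from the text. The ReLU activation, the placement of dropout after each hidden layer, the softmax output, the weight initialization, and the input and class counts used in the usage example are assumptions, and all function names are hypothetical.

```python
import math
import random

def dense(x, W, b):
    """Fully connected layer: one output per weight row."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(v):
    return [max(0.0, a) for a in v]

def dropout(v, rate, rng):
    """Inverted dropout: zero units with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in v]

def softmax(z):
    m = max(z)
    e = [math.exp(a - m) for a in z]
    s = sum(e)
    return [a / s for a in e]

def init_layer(n_in, n_out, rng):
    scale = 1.0 / math.sqrt(n_in)  # assumed initialization
    W = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def build_mlp(n_features, n_classes, hidden=256, seed=0):
    """Two hidden layers of `hidden` units followed by an output layer."""
    rng = random.Random(seed)
    sizes = [n_features, hidden, hidden, n_classes]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

def forward(layers, x, training=False, rate=0.4, rng=None):
    """Class probabilities; dropout is applied only when training=True,
    in which case `rng` (a random.Random instance) must be supplied."""
    h = x
    for W, b in layers[:-1]:
        h = relu(dense(h, W, b))
        if training:
            h = dropout(h, rate, rng)
    return softmax(dense(h, *layers[-1]))

def cross_entropy(probs, label):
    """Cross-entropy loss for one sample with an integer class label."""
    return -math.log(probs[label])
```

For example, `forward(build_mlp(n_features=20, n_classes=3), [0.1] * 20)` returns three class probabilities that sum to 1; the values 20 and 3 are placeholders.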
5. Discussion and Conclusions
In this section, we summarize the results of the experiments conducted in
Section 4.
ADHD is a mixed behavioral disorder that exhibits symptoms such as carelessness (distraction) and hyperactivity–impulsivity and has genetic, neurological, and psychosocial associations. It typically appears at a young age and persists into adolescence and adulthood, and it therefore requires early diagnosis and treatment. Generally, ADHD is diagnosed by a clinician who synthesizes the results of an interview with the child’s parents, assessment scale tests, and tools that can objectively measure ADHD symptoms. However, parents or teachers may regard the symptoms of ADHD in children as mere distraction or as symptoms that can be ignored. Furthermore, because, to date, ADHD diagnosis has relied mostly on parents, early diagnosis of ADHD is very difficult. Therefore, in this study, a new robot-based game that can quantitatively measure abnormal behavior was proposed as an ADHD screening diagnostic tool.
First, we developed a non-contact sensing system that can quantitatively study the movements of children with ADHD. This system was introduced in detail in
Section 3.3, and when using five sensors, it can acquire the subject’s skeletal data with an error of up to 52 mm.
In addition, in this study, the effectiveness of the ADHD screening diagnostic tool was verified. First, to improve the performance of the ADHD screening diagnostic tool, video-labeling verification was performed on the main features measured while a child played a robot-assisted game. This experiment was validated on the data of 828 children, and the procedure improved the detection accuracy from 92.59% to 94.81%.
In the second experiment, the consistency of the data collected in different environments was verified. This was performed using data obtained from seven groups; the maximum accuracy was 97.06%, and the minimum accuracy was 92.27%. By verifying that the proposed robot-assisted ADHD screening game can selectively diagnose children for ADHD with an accuracy of at least 92%, the applicability of the proposed tool as an ADHD screening diagnostic tool was verified.
In addition, the performance of the proposed tool was compared with that of existing tools for ADHD diagnostic screening. The accuracy of screening using the robot game was almost 95%, and the classification results for the ADHD and ADHD risk groups also had high sensitivity. In contrast, the classification results using the existing ADHD screening tools showed very low sensitivity for the ADHD and ADHD risk groups, which shows that the existing screening diagnostic tools cannot classify the ADHD or ADHD risk group accurately. The results also showed that the voluntary participation of subjects in ADHD screening is essential. In the HC group, parents voluntarily participated to determine whether their children had ADHD, and the results of the existing screening tools for this group were similar to those of the robot game. However, because the participants of the other groups were randomly recruited from an elementary school and their parents were then asked to use an existing screening diagnostic tool, the reliability of those results was very low. In other words, the existing screening tools produce significantly varying results depending on how actively parents participate in the test, which means that they have no choice but to rely on the voluntary participation of parents.
Finally, we analyzed the meaning of the results shown in each of the extracted indicators. Using Equation (1), we analyzed the BD between the ADHD, ADHD risk, and normal groups for each indicator. The results are shown in
Table 13. As mentioned before, indicator A1 indicates the extent of the child’s movement while the robot was explaining and demonstrating. As shown in
Table 13, the BD of indicator A1 between the groups is 0.6325, 0.8134, and 0.813, respectively. This is large compared with the other indicators. Therefore, this result means that the similarity of indicator A1 appearing in each class is low. Furthermore, the BD between the normal and ADHD risk groups was 0.6325, and that between the other groups was 0.81 or more. This suggests that the data distribution of the ADHD group is far from that of the other data groups, so it can be more properly classified.
Indicators A2 and A3 mean “sitting down during wait” and “leaving the waiting area during wait”, respectively. In addition, indicators B1 and B2 mean “entering the game board before the start instruction” and “not playing the game after the start instruction”, respectively. Indicators A2, A3, B1, and B2 are therefore all expressed as counts. The BDs between the groups were 0.1783, 0.3813, and 0.4157 for A2 and 0.3627, 0.6184, and 0.6240 for A3. Similarly, the BDs were 0.1273, 0.2135, and 0.2823 for B1 and 0.0369, 0.0752, and 0.0698 for B2. This means that the similarity between the groups is higher for these indicators than for indicator A1.
In the case of B3, the BDs were 0.2217, 0.22, and 0.1637 for each group, respectively. This suggests that B3 separates the data groups less well than indicator A1. The BDs of indicator B5 were 0.1015, 0.1417, and 0.1504, which also means that the distribution of indicator B5 is very similar across the groups and that the difference between the data groups is small compared with A1. C1–C4 indicate the child’s correct/incorrect response to stimuli. The BDs of these indicators range from 0.0674 to 0.2787, which shows a high degree of similarity in data distribution compared with other indicators. Indicator C5 is the time it takes for the child to respond to the stimulus, and the BDs between the groups are 0.6173, 0.6612, and 0.6910, respectively. Lastly, D1 and D2 are the total time and distance traveled by the child playing the game, respectively. The BDs of indicators D1 and D2 ranged from 0.4546 to 0.6658 and from 0.3470 to 0.5838, respectively. In conclusion, the similarity between the data groups was lowest for A1, followed by C5, D1, D2, A2, and A3, which is similar to the feature importance results in
Table 3.
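Assuming that BD denotes the Bhattacharyya distance of Equation (1), its discrete (histogram) form can be sketched as follows. Equation (1) is not reproduced here and may use a different but related form. The binning scheme and the names `histogram` and `bhattacharyya_distance` are hypothetical.

```python
import math

def histogram(values, edges):
    """Normalized histogram; `edges` are ascending bin boundaries.

    Values outside [edges[0], edges[-1]] are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]

def bhattacharyya_distance(p, q):
    """BD = -ln(sum_i sqrt(p_i * q_i)) for two normalized histograms.

    0 for identical distributions; grows as they overlap less, and is
    infinite when they do not overlap at all.
    """
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.inf if bc == 0 else -math.log(bc)
```

Under this reading, a larger BD for an indicator means that its per-group distributions overlap less, which is why indicators with a large BD separate the groups more easily.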
This study showed that the robot-led game can be a useful tool for ADHD screening of children. As described above, ADHD is a disorder in which early detection and treatment are critical. In addition, subjects’ voluntary participation is necessary to accurately diagnose ADHD. In this sense, the proposed game and sensing system has the advantage of objectively measuring the activity and task performance of a child who directly plays the game, without relying on parental evaluation. If this system can be integrated into educational settings, such as elementary schools, it will be a robust tool that can compensate for the shortcomings of the existing questionnaire-based screening method.