1. Introduction
Government funding accessibility and involvement in the development of National Healthcare Systems represent a long historical process [1]. Internationally, significant discoveries have become possible through the availability of this financial resource, contributing to a roughly 2.44-fold increase in life expectancy, from 29 years in 1800 to more than 70 years in 2019 [2]. Furthermore, the senior population segment is expected to continue increasing in the near future. In [3], a UN statistic depicts the international prediction of the evolution of the world population until the year 2100: more than 10 billion people in total, of which roughly 60% are expected to be aged between 25 and 64 and more than 2 billion to be older than 65.
As a result, delivering medications in the dosages advised by medical professionals becomes a critical requirement, with this sector becoming increasingly significant for the population segment with expected medical problems. Unfortunately, pharmaceutical manufacturers across the world produce medications in standard weights, and the doses given to patients are not customized for each individual. Physicians frequently prescribe half or three-quarter tablets to achieve doses lower than the smallest available strength. Sometimes, purchasing the larger strength available at a given moment and splitting the pill can help patients save the money and time spent finding or administering their medication. Besides the cost-saving potential, tablet splitting has a number of advantages, including providing the proper dosage in cases where slow dose titration, dose tapering, or dosage adjustment for geriatric and pediatric patients is necessary. However, splitting medications can be both expensive and harmful if carried out incorrectly.
The current paper proposes a flexible augmented reality-based solution for splitting pills, aimed at customizing specific dose weights.
To aid the manual pill division process, some medicines are currently manufactured with an engraved score line that provides a visible delimitation between the two halves.
Figure 1 [4] shows an example: paracetamol.
Specific drug information can also be obtained through automatic detection. The identification of numbers in images is an extensively studied aspect in the literature; various data sets assist the development of detection algorithms based on neural networks, including the Modified National Institute of Standards and Technology database (MNIST), the National Institute of Standards and Technology Special Dataset 19 (NIST SD19), and the United States Postal Service (USPS) data set [5]. MNIST provides 60,000 images for training detection algorithms, as well as an additional 10,000 images for testing the effectiveness of the created algorithms. Compared to MNIST, the USPS database comprises just 9298 samples, reflecting automated captures of USPS envelopes. With over 800,000 character images gathered from over 3600 writers, NIST SD19 is the richest in terms of recorded images [6]. Convolutional neural networks were used in the dedicated literature with the aid of these data sets [7]. The results of model training and testing were extremely good: the accuracy ranged from 95% to 99.45% when using a pattern recognition algorithm built with the pattern recognition tool (PRTool) [8], and reached 99.45% when using a backpropagation method [9]. Given the time required and the large number of images necessary to train neural networks, an optical character recognition algorithm was chosen for the current solution. The best results were obtained when using Android phones; the solution proposed in [10] achieved 80% accuracy in drug information detection.
Given that not all solid pharmaceuticals can be divided [11], deep convolutional networks and convolutional networks represent a solution for pill identification. Furthermore, depending on the algorithm requirements and performance, the detection methods vary in data set usage, from hundreds of images [12] to a single image [13]. As a result, the validation precision is substantially influenced: for a single image it ranges between 11% and 46% [13], while for 5284 images, 98% is achieved [14]. For the current work, a faster technique was chosen, without involving a large number of images as the data set, and with a similar outcome. Based on the observation that the line engraved in some pills supports the manual fragmentation process, the proposed novel solution involves, as a first step, the detection of this surface streak. Based on previously reported results, the performance of the Hough transformation was examined. In [15], the accuracy obtained for land transport route identification using the Hough transform was 95.7%, indicating that this approach could also be successfully applied to the current work.
As a second step, a modern tactic based on augmented reality (AR) was considered for increasing user experience and support. The potential of augmented reality in healthcare has been investigated, and augmented reality applications have been reported for user safety assurance, as complementary aiding support in healthcare. In [16], a newly developed algorithm evaluated pedestrians’ attentiveness to their surroundings based on factors such as heart rate or cell phone usage; when an imminent hazard is identified, the pedestrian is informed via an augmented reality-based graphical interface. In [17], in research associated with the psychological monitoring of patients, the use of an augmented reality gaming mode encouraged movement and, as a result, a decrease in stress levels was reported. Furthermore, in [18], augmented reality features were successfully associated with oral surgery.
Augmented reality capabilities for telehealth-type applications have also become a reality in [19], where an immersive augmented reality application was developed for clinical consultations using an iPad and Kinect sensors, allowing the physician to easily explain complex medical aspects through simulation visualization based on AR functionalities. The authors appreciated the system as a low-cost, rational implementation example based on in-house available devices. The opportunity to develop similar apps for Android operating systems remains to be explored.
Besides its key role as a health education resource [20,21,22], AR could represent a breakthrough in the cost-efficient self-assessment of mental illnesses, in a context in which a dire estimation indicates that a person dies of suicide every 40 s. With the public resources available for diagnostics and help being underfinanced, the authors of [23] presented an AR software tool for mental health information dissemination and self-assessment. The tool uses a license-based software development resource for the creation of AR-triggered images for information support and download. An interesting application is presented in [24], where AR tools open new, uncharted tracks for patient support in stressful situations. Based on Kinect sensors, body sensors, and mobile devices, the application implements a relaxation service for controlled breathing techniques and, further, for stress management. Another area in which AR applications could provide strong support in healthcare is dementia patient care and the improvement of patients’ daily lives [25], concerning visual aids for objects (e.g., medicine to be taken), people identification, and speech commands used for reminders.
In conclusion, an augmented reality solution for assisting patients with their medication could add another line of defense in safety assurance, both for healthcare professionals and for the health-monitored population.
To streamline the process of controlling the doses recommended to patients, a flexible approach for the accurate administration advised by an expert was developed. By applying augmented reality to pill division support, the process of delivering the recommended dosages to patients is enhanced and reinforced. The proposed solution offers multiple options regarding pill fragmentation. Given that a part of the system’s users could be elderly individuals with mobility issues, the flexible AR solution allows pill tracking and new surface creation using the Kanade–Lucas–Tomasi (KLT) algorithm [26], a method that identifies feature displacements between images by computing the sum of squared intensity differences. This functionality was designed for persons suffering from illnesses with symptoms similar to Parkinson’s disease.
This paper is divided into four sections. The introduction underlines the novelty of the research, with an emphasis on augmented reality applicability in healthcare. Section 2 provides a technical overview of the equipment utilized and the chosen techniques. The results are presented in Section 3. The article concludes with an overview of the work conducted and the outcomes achieved; the same section also describes the authors’ anticipated future developments.
2. Materials and Methods
To fulfill the goals outlined in the first section, an adaptable solution using object recognition and tracking for augmented reality, inspired by [27], was developed and is presented next. Due to the vast number of established functions and enhanced capabilities for augmented reality application development, the MATLAB programming environment was chosen to create a proof of concept.
To accurately identify the drug weight values, the OCR tool in MATLAB was set to detect only digits [28]. Optical character recognition (OCR) is a technology able to recognize text in images. The process of text recognition involves text region localization and text region verification [29]. The mathematical algorithms behind this technology are Otsu’s algorithm for the segmentation process and the Hough transform method for skew detection [30].
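A minimal sketch of how a digit-only OCR configuration can be expressed in MATLAB is given below; the input image, the preprocessing, and the confidence filtering are illustrative assumptions rather than the authors’ exact code.

% Sketch: restrict MATLAB's OCR engine to digits for weight detection.
% The file name and confidence threshold are illustrative assumptions.
img  = imread('pill_package.png');      % hypothetical input frame
gray = rgb2gray(img);

% Limit the recognizable character set to digits only.
results = ocr(gray, 'CharacterSet', '0123456789');

% Keep only confidently recognized digit groups (e.g., "500" for 500 mg).
digits = results.Words(results.WordConfidences > 0.5);
disp(digits);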
As presented in Section 1, the Hough transformation accurately assesses whether the pill under consideration can be split [31]. The method identifies the score line after the original image has been converted to binary and the noise in the binarized image below a particular threshold has been eliminated using MATLAB’s bwareaopen function [32]. Following several experiments, a 400-pixel threshold was established. If the algorithm finds at least four line segments in the processed image of a pill, one can assume that the pill can be divided into multiple smaller parts.
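A minimal MATLAB sketch of this decision step is shown below; apart from the 400-pixel noise threshold and the four-segment rule stated above, the parameter values (number of Hough peaks, fill gap, minimum segment length) are assumptions made for illustration, not the exact configuration used by the authors.

% Sketch of the splitting decision: binarize, remove small noise blobs,
% run the Hough transform, and accept the pill as divisible if at least
% four line segments are found. Peak/segment parameters are assumed.
gray = rgb2gray(imread('pill.png'));
bw   = imbinarize(gray);
bw   = bwareaopen(bw, 400);             % drop connected components < 400 px

[H, theta, rho] = hough(bw);
peaks    = houghpeaks(H, 10);
segments = houghlines(bw, theta, rho, peaks, 'FillGap', 5, 'MinLength', 20);

if numel(segments) >= 4
    disp('Pill can be divided.');
else
    disp('No score line detected - pill should not be split.');
end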
The Hough transform can detect lines, circles, and other structures whose parametric equation is known. It addresses this problem by grouping edge points into object candidates and by performing an explicit voting procedure over a set of parameterized image objects. Furthermore, the Hough transform can detect these structures under noise and partial occlusion, using a thresholded edge image as its input. A thresholded image is obtained by converting an initial image into a binary image: in the simplest thresholding method, every pixel value larger than the threshold is set to the maximum color value (white), and every pixel value smaller than the threshold is set to the minimum color value (black). In Figure 2, a conceptual representation of a thresholded image is presented to form a better comprehension of the method.
In the edge detection process, an edge is considered a local discontinuity in the pixel values that exceeds a pre-established threshold. Edge detection comprises mathematical approaches for identifying the points of a digital image at which the brightness changes sharply, i.e., has discontinuities; such points are typically organized into a set of curved line segments termed edges. Finding discontinuities in a 1D signal is also known as step detection, and the problem of finding signal discontinuities over time is known as change detection. Edge detection is a fundamental tool in image processing, machine vision, and computer vision, particularly in the areas of feature detection and feature extraction. For appropriate threshold problem management, the first derivative is applied to detect discontinuities in the pixel values, as seen below:

f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \approx f(x + 1) - f(x)
If 2D images are considered, partial derivatives can be applied (the gradient vector expression) for the points defined by an f(x, y) function. For scalar information, the gradient states the direction of the greatest change, and the gradient magnitude is a scalar number that describes the local rate of change in the scalar field. The magnitude of an increment, M_g, is given using partial derivatives, as Equation (3) shows:

M_g = \sqrt{ \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2 } \quad (3)
For edge detection, the methods generally focus on identifying the magnitude of the gradient and, after this step, on applying a threshold to the result.
The matchFeatures function from MATLAB [33] was used to determine whether the pill studied in the initial analysis process and the one used to represent the results are the same.
Feature identification/detection involves finding points of interest in an image by identifying specific locations (areas of pixels inside the image). For feature identification, these specific locations are the ones for which even a minimal displacement of the analyzed patch, in any direction, produces a significant difference in pixel values.
This difference in pixel values is obtained by evaluating each pixel before and after the displacement, using the sum of squared differences:

E(u, v) = \sum_{(x, y) \in S} \left[ I(x + u, y + v) - I(x, y) \right]^2

where x, y are the coordinates of a point inside a considered surface S, u, v is the displacement considered for that point, and I(x, y) represents the pixel value. Considering next the Taylor series expansion of I:

I(x + u, y + v) \approx I(x, y) + u I_x + v I_y
If the displacement introduced by u and v is small, a first-order approximation can be used for feature detection:

E(u, v) \approx \sum_{(x, y) \in S} \left[ u I_x + v I_y \right]^2

where I_x = \partial I / \partial x and I_y = \partial I / \partial y.
The displacement can then be computed from a sum of derivatives, as below:

E(u, v) \approx \begin{bmatrix} u & v \end{bmatrix} \left( \sum_{(x, y) \in S} \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} \right) \begin{bmatrix} u \\ v \end{bmatrix}
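The matrix form above underlies minimum-eigenvalue (Shi–Tomasi) corner detection. A possible MATLAB sketch of the feature detection and matching step, in the spirit of the matchFeatures comparison described earlier, is given below; the choice of detectMinEigenFeatures and the acceptance threshold of ten matched pairs are assumptions made for illustration, not the authors’ exact configuration.

% Sketch: detect corner features in the reference image and a new frame,
% then use matchFeatures to decide whether the same pill is being viewed.
% Detector choice and acceptance threshold are assumptions.
refGray   = rgb2gray(imread('reference_pill.png'));   % hypothetical files
frameGray = rgb2gray(imread('current_frame.png'));

refPts   = detectMinEigenFeatures(refGray);            % Shi-Tomasi corners
framePts = detectMinEigenFeatures(frameGray);

[refFeat, refValid]     = extractFeatures(refGray, refPts);
[frameFeat, frameValid] = extractFeatures(frameGray, framePts);

indexPairs = matchFeatures(refFeat, frameFeat);

% Accept the pill as identical if enough features match (threshold assumed).
if size(indexPairs, 1) >= 10
    disp('Same pill as in the training stage.');
else
    disp('Pill not recognized - adjust the viewing angle.');
end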
Pill tracking is realized through the Kanade–Lucas–Tomasi algorithm [34]. This algorithm uses spatial intensity information to direct the search for the position that yields the best match; a local search is made using gradients weighted by an approximation of the second derivative of the image. The algorithm is implemented in the vision.PointTracker function from MATLAB®.
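A compact sketch of how vision.PointTracker can drive this kind of tracking is shown below; the webcam source (which requires MATLAB’s USB webcam support package), the fixed number of processed frames, and the marker overlay are illustrative assumptions rather than the authors’ implementation.

% Sketch of Kanade-Lucas-Tomasi tracking with vision.PointTracker.
% Webcam source, frame count, and marker overlay are assumptions.
cam        = webcam;                    % requires the USB webcam support package
firstFrame = snapshot(cam);
gray       = rgb2gray(firstFrame);

points  = detectMinEigenFeatures(gray);
tracker = vision.PointTracker('MaxBidirectionalError', 2);
initialize(tracker, points.Location, firstFrame);

for k = 1:300                           % track for a fixed number of frames
    frame = snapshot(cam);
    [trackedPts, validity] = tracker(frame);        % KLT update
    frame = insertMarker(frame, trackedPts(validity, :), '+');
    imshow(frame);
end
clear cam;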
The images were taken with a Logitech C920e camera [35], which is capable of capturing full HD 1080p video at a 30 fps frame rate.
Figure 3 represents a block diagram of the newly designed flexible solution.
3. Results
3.1. Data Extraction
The first stage in employing the newly proposed flexible method is to capture the reference image and evaluate the pill’s capacity to be split. The results of the Hough transformation are evaluated for this purpose.
Figure 4 depicts the end outcome. It is assumed that the pill can be separated if the algorithm finds at least four line segments in the processed picture. The detection performance is 80% after 20 tests.
If the Hough transformation [29] fails to detect the presence of a line, an error message similar to the one shown in Figure 5 is issued.
In the following phase, the weight detection technique is applied. The graphical interface for determining the pill’s standard weight is shown in Figure 6. The OCR method from MATLAB [28], described in Section 2, is utilized for extracting the information. The detection accuracy equals 30% after a set of 20 tests completed using the Logitech C920e camera [35]. As a result, the user must confirm the outcome via a customized notification.
The “Recommended Dose” button in the graphical interface is pushed to determine the dose recommended by the medical professional; the result is shown in Figure 7.
For an accurate reading, the camera needs to record only one number, used as a reference for the variable in question. As a result, to ensure correct weight identification, the user is notified of the read values. In the case of error detection, hitting the “Reset operation” button will restart the procedure.
To increase the performance of the image-specific feature extraction required in the training process, a colorful backdrop was utilized. The backdrop is a 2 × 2 cm² surface split into four rectangles by two lines intersecting in the middle. The user is prompted to position the pill in the middle of this surface to improve the likelihood of proper feature identification.
If the examined image is blurry and does not allow for a precise identification of the pill’s characteristics, a new error message is created, as shown in Figure 8.
3.2. Results Generation
The first stage in results generation is the computation of the ratio between the prescribed dose and the medication’s standard weight, using the data recorded in the preceding subsection’s stages. Based on this ratio, the proportion between two surfaces is calculated: one area reflects the currently delivered dosage and the other corresponds to the dosage for later assimilation. The two surfaces are generated based on the estimated percentage, using a diagonal matrix. As a result, depending on the drug specificity, the pill can be divided into two parts in a continuously adjustable proportion. A black image overlays, for the patient, the delivered segment, while the remaining portion remains transparent.
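One plausible way of turning the computed ratio into an overlay is sketched below; the bounding-box-based masking, the crude segmentation, and the chosen opacity are simplifying assumptions made for illustration and do not reproduce the authors’ diagonal-matrix construction.

% Sketch: darken the fraction of the pill corresponding to the delivered
% dose, leaving the remainder visible. Dose values, segmentation, and
% opacity are illustrative assumptions.
recommendedDose = 250;                          % mg, value read earlier (example)
standardWeight  = 500;                          % mg, detected pill weight (example)
ratio = recommendedDose / standardWeight;

frame    = imread('current_frame.png');         % hypothetical live frame
pillMask = imbinarize(rgb2gray(frame));         % crude pill segmentation (assumption)
pillMask = bwareafilt(pillMask, 1);             % keep the largest blob only

stats = regionprops(pillMask, 'BoundingBox');
bb    = stats(1).BoundingBox;                   % [x, y, width, height]

% Cover part of the pill proportional to the delivered dose.
frame = insertShape(frame, 'FilledRectangle', ...
    [bb(1), bb(2), ratio * bb(3), bb(4)], 'Color', 'black', 'Opacity', 0.9);
imshow(frame);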
If the pill can be split, the first image, presented in Figure 9, is obtained after clicking the “Reference image” button. The last step is for the user to hit the “Start operation” button.
The authors’ algorithm examines the initial snapshot of the pill, as shown in Figure 9, to check whether it is identical to the one used in the first stage of training the program. To consider the two tablets identical, the matchFeatures function in MATLAB [33] is used. If this identification is not feasible, the user is notified by a message, as shown in Figure 8, advising that the pill’s viewing angle may need to be changed.
If the two pills are identical, the software generates the result by continuously superimposing the two surfaces over the initially identified pill, using the Kanade–Lucas–Tomasi pill tracking algorithm [26]. Figure 10 depicts the final result.
To protect privacy, the user can turn off the camera by hitting the “Stop camera” button.
Table 1 summarizes the above-mentioned percentage results.
3.3. Performed Tests
Several experiments were conducted to validate various scenarios of application usage. The situations demonstrated through the test cases may arise in daily use, when the application is misused. The validation obtained through this type of robust testing underlines the application’s dependability.
The authors’ testing revealed that the algorithm can accurately produce the two surfaces even if the pill is not in the center of the 2 × 2 cm² surface or if the shape of the pill is not round. The same positive outcome is attained if the dividing line of the pill is oriented in a different direction from the one used for training.
All of the above-described results could only be generated if the borders of the backdrop surface could be identified, indicating its critical relevance. As a result, situations such as exceeding the surface owing to pill size, changing the pill after the training stage, or placing the pill outside the surface resulted in incorrect findings, as it was not feasible to generate the two surfaces.
4. Discussion
In this paper, a flexible augmented reality solution for medication weight establishment was presented. The tablet weight and the dose suggested for the patient were determined using an optical character recognition algorithm. The Hough transformation facilitated the splitting decision and process. Furthermore, an algorithm was created to identify the pill area corresponding to the specialist’s recommended dose. For the pill identification component, impressive, unexpected results were obtained using a single image for training, by employing a 2 × 2 cm² backdrop surface with contrasting hues.
Extending the manufacturer-supported option of splitting pills in half, the new flexible approach allows the pill to be divided into a larger number of pieces.
To avoid the decisive impact of a neural network training process, several independent mathematical techniques were utilized: the Hough transformation for finding pills that can be split, an optical character recognition method for recognizing digits, and the Kanade–Lucas–Tomasi object tracking technique.
Although the detection rates in the tests were lower than those reported in the literature (30% compared to 80% for digit detection with an optical character recognition algorithm, and 80% compared to 95.7% for line detection with the Hough transform), the proposed solution was tested exclusively using a webcam recording video at a resolution of 1080p and 30 frames per second. By increasing the quality of the processed pictures, an improvement of the aforementioned percentages is expected. As a result, in terms of future development goals, using a camera capable of capturing 4K video represents a different setup to be evaluated. In addition, the evaluation of an application built natively for mobile devices is another goal for future phases, based on future integration requirements.