1. Introduction
As mobile devices continue to proliferate, the demand for reliable and secure authentication and access control mechanisms becomes increasingly paramount [1]. Various biometric modalities such as fingerprints, iris patterns, faces, voices, and even ears have been explored as potential solutions [2,3]. However, integrating additional sensors to capture such biometric data can substantially increase the cost of mobile devices, potentially limiting their accessibility to a broader user base.
One alternative approach is to leverage existing touchscreen technology to capture biometric data, specifically focusing on earprints. Earprints have long been utilized in forensic research due to their unique identifying characteristics [4,5,6]. Recent studies have demonstrated the feasibility of utilizing the distinctive shape and geometry of an individual’s ear for accurate identification, even among identical twins [7]. This intriguing avenue of research provides an accessible and cost-effective means of biometric authentication.
While traditional earprints are typically captured using specialized hardware or modifications to mobile devices [8], a related biometric feature known as “ear-touch” can be acquired using a smartphone’s touchscreen. Although ear-touch may exhibit lower image quality compared to dedicated earprint capture methods, it can be conveniently recorded using a standard smartphone with multi-touch capability. This makes ear-touch an appealing and practical biometric measure for widespread adoption.
This paper delves into the application of ear-touch for mobile user authentication and access control. We propose a method for capturing ear-touch data and address the challenge of missing data points, ultimately achieving high performance with respect to the equal error rate (EER).
The significance of this innovative approach lies in its potential to address several pressing issues:
Enhanced security: ear-touch authentication leverages the unique shape and contact geometry of an individual’s ear, making unauthorized access exceptionally difficult.
User convenience: While security is paramount, user experience is equally crucial. Ear-touch authentication strikes a balance between security and convenience. Users can unlock their devices or access applications seamlessly by merely touching their ear, eliminating the need to remember complex passwords or carry additional hardware like fingerprint sensors.
Future-proofing mobile security: As the mobile landscape continues to evolve, so do the techniques employed by malicious actors. Ear-touch authentication represents a forward-thinking solution that anticipates future security challenges. Its adaptability to emerging threats positions it as a long-term and sustainable authentication method.
The structure of this paper is as follows: In Section 2, we present the problem statement and provide a comprehensive review of the pertinent literature, including the limitations of existing verification systems based on earprints. Section 3 outlines the ear-touch database we have collected, detailing the measurement procedures and discussing the issues encountered in dealing with missing data. In Section 4, we present our solution and substantiate its effectiveness. In Section 5, we discuss our findings and present the results we have obtained. Lastly, we summarize our conclusions in Section 6, emphasizing the potential implications of our work in the broader context of mobile device security and user authentication.
2. Problem Statement and Its Literature Solutions
This section has two subsections: we first state the problem, and then review the similar research that has been undertaken recently.
2.1. Problem Statement
An ear-touch is a new biometric feature introduced in this research. An ear-touch is captured with a smartphone’s multi-touch screen, so it can be used for authentication on mobile devices in the same way as fingerprints, facial images, or iris prints. Ear-touches have not previously been used as a biometric measure in any research, although related work on earprint recognition has been carried out on mobile devices. In this research, we propose a method for an ear-touch recognition system on mobile devices. Ear-touches also introduce a problem we call ‘missing points’: owing to the physical features of ears and the way people press smartphones against them, the number of touched points can vary from one captured ear-touch to another. In this research, we take these missing points into account.
2.2. Literature Solutions
Touchscreens have already been used to capture biometric data, but only after making changes inside the smartphones. In [9], the authors used a touchscreen to capture fingers, fists, ears, and palms through their system, Bodyprint. The capacitive touchscreen sensor was used to capture biometrics from 12 users. The touchscreen had an input resolution of ~6 dpi, and the sensor captured 27 × 15 pixel images with 8-bit depth. Speeded Up Robust Features (SURF) descriptors were used for feature extraction, and features were matched with the L2 distance over 12 key frames extracted by the SURF descriptors. The performance of the method was tested on 12 participants in 12-fold cross-validation, over 864 samples in total; the false rejection rate was 26.8% across all biometrics (fingers, fists, ears, phalanges and palms), and 7.8% for ears alone. Touchscreens were also used in [10] as a sensor for capturing biometric data to authenticate users on mobile devices. Ordinarily, touchscreens do not expose images of what is pressed against them, so the Android kernel of the mobile devices was modified to obtain the touchscreen sensor data as images, capturing all touch points on the capacitive screen. The resolution of the touchscreen was 6 dpi, or 27 × 15 pixels. The database contained 1520 images collected from 37 subjects, with 40 images taken from each subject. Recall and precision were reported as performance evaluation metrics and were 0.5960 and 0.8761, respectively.
The paper referenced in [11] introduced a novel earprint database, EINTU, and a deep learning model [12], DEL, for personal verification using earprints. It overcomes challenges in dataset creation and achieves high recognition accuracy, opening up possibilities for earprint-based biometric security in mobile phone communications.
In [13], the use of earprints in the forensic field, and the stability of and variability in ears, were reviewed, identifying the substantial features in earprints that can be used in forensic identification. In addition, in [14], a Forensic Ear Identification (FearID) earprint identification system was proposed, with 7364 earprints collected from 1229 participants. Three feature extraction methods were introduced: weighted width comparison (connected structures assumed to represent the imprints are determined, and their corresponding intensities are weighted to extract local intensities as signals); vector template matching (VTM), based on the anatomical annotation of the earprints, in which each print receives a template of labelled points representing earprint landmarks and minutiae of different classes, and prints are compared by assessing the similarity of their templates; and angular comparison, which compares signals while tracking the angle between the medial axes of the connected structures and the x-axis. A logistic regression was used for classification, and the data were split into a training set and a testing set; an EER of 9.3% was reported for the test set. In [15], the authors proposed a hybrid method based on global and local features for earprint feature extraction. On the one hand, binary images and a comparison between the model and the query earprint were used to find global features. On the other hand, local features were extracted using scale-space extrema detection (the Difference of Gaussians is computed to identify candidate regions invariant to scale and rotation), keypoint localization (each local extremum of the Difference of Gaussians is compared to its 16 neighbours and kept only if it is larger or smaller than all of them; the candidates with the lowest contrast are rejected after a detailed fit to the nearby data for location, scale, and ratio of principal curvatures), and orientation assignment (each keypoint is represented by 16 orientations obtained from the local image gradient directions). The proposed method was applied to the FearID database and an EER of 1.87% was reported.
To recognize an ear-touch extracted via the touchscreen of a mobile device, it is necessary to match finite sets of two-dimensional points [16]. Various algorithms have been used for this task. In [17], to match two point sets exactly, the centre of each set was found and polar coordinates were calculated relative to the centroids. In another paper [18], the authors used a one-to-one matching method, although this method requires exactly the same number of points in both sets.
Overall, the studies reviewed suggest that ear-touch-based mobile user authentication has potential for practical application in mobile device security.
3. Ear-Touch Database
Touchscreens are one of the sensors used to capture data, in particular biometric data. They have been used in many real-world applications, such as Bodyprint on the Yahoo smartphone [9]. Figure 1 shows the structure of a human ear.
Figure 2a shows how to capture the data [19]. As shown in Figure 2b, the subject holds a smartphone to their ear, and a mobile application captures all touch points registered on the touchscreen. The outer ear (shown in Figure 1) is composed of different parts, including the tragus, antitragus, helix, root of helix, crus of helix, antihelix, lower crus of antihelix, lobule anterior notch, navicular fossa, crus of antihelix, anti-helical fold, lobule, scapha and concha. However, only the first eight parts are touched or captured by a touchscreen, an example of which is shown in Figure 2b. The touched points are extracted and treated as biometric features to authenticate an individual. To collect touchscreen ear biometric data, a mobile application was developed in the Android Studio environment. The application can simultaneously record touchscreen data from several points. First, the user entered some personal information in the application; then, the touches for the left and right ears of each subject were captured separately.
Ethics committee
As we needed volunteer participants to collect datasets for our investigation of an ear recognition system’s performance under presentation attack detection, the ethics committee of Warsaw University of Technology approved the experiment protocol for the data collection. The experiment was designed, in particular, to establish whether the ear recognition system needs an ear presentation attack detection method. Before the experiment, the participants signed consent agreements. Non-biometric personal data (age, name, gender) were also collected from the participants and maintained separately to guarantee the security of their personal data. The participants were thus familiar with the experiment, which we described to them in full detail, and signed consent forms were gathered from all of the volunteers. The collected database consists of ear data taken from the participants, who were of different origins (namely, Azerbaijan, Afghanistan, Algeria, China, Ecuador, India, Iran, Jordan, Latvia, Mexico, Oman, Poland, Portugal, Spain, Turkey, Vietnam, Uzbekistan) and different age groups.
WUT-Ear V1.0 database collection methodology
To acquire both ear-touches and ear photos, a Samsung Galaxy A7 was used. It features a Super AMOLED capacitive touchscreen with a resolution of 1080 × 2220 pixels. The database has two parts, ear photos and ear-touches; only the ear-touches were used in the experiments reported here. There were 138 subjects, with over 9000 ear photos and approximately 1000 ear-touches.
Ear-touch data were collected from almost half of the participants. More details about the database are presented in Table 1. There were almost 20 ear-touches per subject. At each session, seven touches were taken from each ear; some acquisitions failed and were counted as unsuccessful attempts. Presentations registering fewer than four touch points were considered unsuccessful, while presentations registering four or more touch points, together with the associated information, were considered successful. This rule was applied to ensure the presence of enough touch points to carry out the calculations.
Owing to a hardware limitation of touchscreens on Android mobile devices, we could not acquire more than one set point from each part of the ear. The details of the dataset are given in Table 1. According to Table 1, there were 57 subjects—40 men and 17 women. It is worth mentioning why we have missing numbers: if the two ears (left and right) of each subject were counted separately, we would have a total of 92 ear classes. Despite the higher number of male subjects, we obtained a greater number of data acquisitions from female subjects, as there were more unsuccessful attempts during the males’ data collection.
The purpose of Figure 3 is to illustrate the distribution of ear-touch interactions among the participants in our dataset. Each bar on the graph represents the percentage of participants who engaged in a specific number of ear-touch interactions during the data collection phase. The horizontal axis denotes the number of ear-touch interactions, ranging from one to the maximum number recorded. Meanwhile, the vertical axis represents the percentage of participants falling into each category of ear-touch interactions. Our dataset encompasses a diverse range of participants, with some individuals exhibiting a higher frequency of ear-touch interactions than others. For instance, a small percentage of participants were observed to have a substantial number of ear-touch interactions, with some reaching up to 30 interactions.
It is worth noting that while our data collection process allowed for the recording of single ear-touch interactions, the focus of this analysis is primarily on participants with multiple ear-touch instances. Thus, the graph highlights the distribution of participants with at least two ear-touch interactions. As depicted in the graph, a notable proportion of participants (e.g., 5%) were found to have 10 ear-touch interactions or fewer, while an even smaller percentage (e.g., less than 1%) exhibited exceptionally high numbers of ear-touch interactions, such as 30 instances. By presenting this distribution, we aim to provide insights into the variability in ear-touch interactions among participants, which is essential for understanding the usage patterns and potential applications of our proposed ear-touch-based authentication system. The participants’ ages ranged from 18 to 60 years old.
In Figure 4, we present the distribution of the maximum number of set points obtained from ear-touch interactions. The vertical axis illustrates the frequency of ear-touch interactions, while the horizontal axis represents the maximum number of set points detected within each interaction.
Our data collection methodology allowed for the capture of ear-touch interactions with varying numbers of set points. However, to ensure robustness and reliability in our analysis, we focused on interactions with a minimum of four set points. Consequently, the range depicted in the graph spans from four to eight set points, reflecting the diversity of the set point configurations observed in our dataset. This approach enables us to explore the relationship between the number of ear-touch interactions and the complexity of set point patterns, providing valuable insights into the characteristics of ear-touch interactions and their potential implications for authentication systems.
Collection of Measurements, Procedures and Problems (Missing Data)
In this section, the data collection experiments are explained. To collect touchscreen ear biometric data, we created a mobile application, as shown in Figure 5. It was developed in the Android Studio mobile application environment. The application is able to take touchscreen data from several points simultaneously. First, the application asks the user for some information about them (Figure 5a). Then, as shown in Figure 5b, ear-touches are captured for the left and right ears of each subject separately.
Like a fingerprint recognition system, the ear-touch needs to be enrolled several times. At each session, we took seven touches from each ear, though some data acquisition attempts proved unsuccessful and were counted as such. In total, we captured 1427 ear-touches, of which 467 had fewer than four set points. To identify unsuccessful presentations, we ignored presentations with fewer than four touch points; a presentation registering four or more coordinate points and the associated information was considered successful. This left 960 successfully captured ear-touch presentations.
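The acceptance rule above can be sketched in a few lines. The representation of a presentation as a list of (x, y) touch points and the names below are our own illustrative assumptions, not the paper’s implementation:

```python
# A presentation is modelled here as a list of (x, y) touch points.
MIN_POINTS = 4  # threshold from the acceptance rule above

def successful(presentations):
    """Keep only presentations that registered at least MIN_POINTS points."""
    return [p for p in presentations if len(p) >= MIN_POINTS]

demo = [
    [(120, 340), (180, 410), (150, 520)],              # 3 points: rejected
    [(118, 332), (183, 405), (149, 515), (201, 610)],  # 4 points: kept
]
print(len(successful(demo)))  # 1

# The reported counts are consistent with this rule:
captured, too_few = 1427, 467
assert captured - too_few == 960  # successful presentations
```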
Missing data problem:
In ear-touch data, we face the problem of missing points: the touches do not have a fixed number of points in each presentation. Normally, during data acquisition, the distances between the points are not small. Figure 6 shows an example of missing points in various touches. It should be noted that, in order to find the best matching algorithm, we must consider the problem of missing points, because even if all translation and rotation problems are solved, missing points can still lead to significant errors in the matching.
First problem: Ear-touch 1 and Ear-touch 2 in Figure 6 have the same number of points and come from the same ear, but because those points lie in different regions, a naive matching algorithm does not recognise them as the same ear.
Second problem: Ear-touch 1 and Ear-touch 3 have different numbers of points, so Ear-touch 3 contains points that do not appear in Ear-touch 1.
4. Materials and Methods
The ear-touch verification system is shown in Figure 7. The input is a set of points with (x, y) coordinates; this set-point information forms the features for the next step of the ear-touch verification procedure. A similarity score s(T, X) is calculated from X (the presented input) and T (the stored features of the ear-touch in the server or database). If the score s(T, X) exceeds the threshold Th, the presentation is accepted.
4.1. Alignment of the Ear-Touches
In this section, we explain our solution method and the results we obtained by applying this method. Our solution has two major parts: “matching” between given ear-touches and a given template, used for authentication; and “template creation/extraction” based on a set of related ear-touches (e.g., ear-touches of some known subject), which can be used as a reference for authentication purposes. These are two basic tasks in our proposed method and there are several challenges to performing these tasks.
The first challenge is in connection with the “missing points”. In each experiment, the touchscreen will return up to eight set points based on the ear contact position with respect to the device coordinate system. There is no guarantee that all the points will be measured in each experiment, and each experiment may contain some missing points.
The second challenge concerns “permutations”. For each ear-touch, the sensor returns a “set” of points. They have no consistent or meaningful anatomical order. So, when matching between two sets, the algorithm should pair the points one by one. In fact, it considers different permutations between these two sets to find the most meaningful pairing.
The third challenge is related to “rotations and translations”. The touchscreen measures the set points with respect to its own coordinate system, not a fixed coordinate system attached to the subject’s ear. So, even when comparing two ear-touches of the same ear, the algorithm should account for the random rotations and translations that may occur between the different data acquisition phases.
To resolve these challenges, we used an optimisation-based approach that tries to find the best matches by minimising some relevant loss functions. To explain this approach, we first consider a simplified scenario, in which there are no missing points, before moving on to the more challenging scenarios, which are more suitable for real world applications.
4.2. A Simplified Scenario without Missing Points
In an ideal world, with no missing points and permutations, each ear-touch would be represented by a sequence of set points of fixed length. In this scenario, comparing two ear-touches is as simple as measuring the distance between all corresponding 2D points, provided a fixed coordinate system exists. Zero distance means all the corresponding points match each other, meaning that the two ear-touches match, whereas any non-zero distance means they do not. Such a distance measure can be defined by Equation (1):

$$d(T, X) = \sum_{i=1}^{N} \lVert T_i - X_i \rVert^2 \qquad (1)$$

where N is the number of set points in each ear-touch, $\lVert \cdot \rVert$ represents the Euclidean norm, T is the template or reference ear-touch, X is the given ear-touch we want to authenticate, and $T_i$ and $X_i$ represent the i-th set points of the template and the given ear-touch, respectively.
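Under the assumption that Equation (1) sums squared Euclidean distances over corresponding points (the exact form may differ in the original), a minimal sketch:

```python
def d(template, touch):
    """Equation (1)-style mismatch between two equal-length, already
    aligned and ordered ear-touches: sum of squared point distances."""
    assert len(template) == len(touch), "a fixed length N is assumed"
    return sum((tx - xx) ** 2 + (ty - xy) ** 2
               for (tx, ty), (xx, xy) in zip(template, touch))

print(d([(0, 0), (1, 1)], [(0, 0), (1, 1)]))  # 0 for a perfect match
print(d([(0, 0)], [(3, 4)]))                  # 25
```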
In a more sophisticated scenario, when rotations, translations, and permutations are present, we have to consider all possible modes of these transformations. So, Equation (1) is modified into Equation (2), which contains an optimisation problem:

$$m(T, X) = \min_{R \in \mathcal{R},\; t \in \mathcal{T},\; P \in \Pi_N} d(T,\; R X_P + t) \qquad (2)$$

In Equation (2), $\mathcal{R}$ is the set of 2D rotation matrices, $\mathcal{T}$ is the set of all possible translations represented by 2D vectors, and $\Pi_N$ is the set of all possible permutations of N points. $X_P$ denotes the ear-touch X with its set points reordered according to the permutation P.
In fact, $m(T, X)$ measures the distance between T and some transformed version of X using the distance d defined in Equation (1). We call this transformed version the best matching form of X w.r.t. T, which will be used to create a template later. This best matching form or “best match” B can be represented by Equation (3), with its parameters defined in Equation (4):

$$B = R^{*} X_{P^{*}} + t^{*} \qquad (3)$$

$$(R^{*}, t^{*}, P^{*}) = \operatorname*{arg\,min}_{R \in \mathcal{R},\; t \in \mathcal{T},\; P \in \Pi_N} d(T,\; R X_P + t) \qquad (4)$$
Pacut [20] resolved the optimisation problem of Equation (4) based on the Procrustes problem [21] and the Kabsch–Umeyama algorithm [22]. His solution is straightforward and efficient, without requiring the usual iterative schemes of general optimisation algorithms. However, his method does not consider permutations and thus has to be extended in order to solve our simplified scenario, which is done in Algorithm 1. Algorithm 1 returns the mismatch based on Equation (2) and the best match based on Equation (3), without considering any missing points.
Algorithm 1: Matching of Two Ear-Touches with no Missing Set-Points

inputs: template T, ear-touch X
outputs: mismatch between T and X based on Equation (2); best match B of X w.r.t. T (Equation (3))

mismatch = infinity
B = X
for each possible permutation P do
    permute X according to P, and save it as S
    use Pacut’s method to solve min over R, t of d(T, R·S + t), obtaining min_aux and the aligned version B_aux of S
    if min_aux < mismatch then
        mismatch = min_aux
        B = B_aux
    end if
end for
return mismatch, B
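A runnable sketch of Algorithm 1 under our own assumptions: the mismatch is a squared-distance sum, and a closed-form planar Procrustes alignment stands in for Pacut’s method, which we have not reproduced exactly:

```python
import math
from itertools import permutations

def align(template, touch):
    """Closed-form 2D rigid alignment (rotation + translation) of `touch`
    onto `template`, with points paired index-to-index: a planar
    Procrustes/Kabsch-style solution. Returns (mismatch, aligned_points)."""
    n = len(template)
    tcx = sum(x for x, _ in template) / n   # template centroid
    tcy = sum(y for _, y in template) / n
    xcx = sum(x for x, _ in touch) / n      # touch centroid
    xcy = sum(y for _, y in touch) / n
    a = [(x - xcx, y - xcy) for x, y in touch]
    b = [(x - tcx, y - tcy) for x, y in template]
    # The optimal angle maximises sum_i b_i . R(theta) a_i.
    c = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a, b))
    s = sum(ax * by - ay * bx for (ax, ay), (bx, by) in zip(a, b))
    theta = math.atan2(s, c)
    ct, st = math.cos(theta), math.sin(theta)
    aligned = [(ax * ct - ay * st + tcx, ax * st + ay * ct + tcy)
               for ax, ay in a]
    mismatch = sum((px - qx) ** 2 + (py - qy) ** 2
                   for (px, py), (qx, qy) in zip(template, aligned))
    return mismatch, aligned

def match(template, touch):
    """Algorithm 1 sketch: try every permutation of the touch points,
    align each candidate, and keep the best match."""
    best_mismatch, best = float("inf"), list(touch)
    for perm in permutations(range(len(touch))):
        candidate = [touch[i] for i in perm]
        mismatch, aligned = align(template, candidate)
        if mismatch < best_mismatch:
            best_mismatch, best = mismatch, aligned
    return best_mismatch, best

# Demo: a rotated, translated, shuffled copy of a template matches it.
theta0 = math.radians(30)
T = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
moved = [(x * math.cos(theta0) - y * math.sin(theta0) + 5.0,
          x * math.sin(theta0) + y * math.cos(theta0) - 2.0) for x, y in T]
mm, _ = match(T, [moved[2], moved[0], moved[3], moved[1]])
print(f"{mm:.2e}")  # effectively zero
```

With at most eight points per touch, the permutation search visits at most 8! = 40,320 candidates, so the brute-force loop remains cheap.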
Creating a reference template from a group of known ear-touches is the other part of our problem, so we also need an algorithm to perform this task. In the absence of missing points, every known ear-touch has the potential to be used as a template. However, in practice, it is preferable to construct or extract the template from a collection of related or known ear-touches. This template can be found by Equation (5):

$$T^{*} = \operatorname*{arg\,min}_{T \in \mathcal{E}} \frac{1}{M} \sum_{j=1}^{M} m(T, Y_j) \qquad (5)$$

where $\mathcal{E}$ is the set of all possible templates or ear-touches, $T^{*}$ is the unknown template we are looking for, $Y = (Y_1, \ldots, Y_M)$ is the set of input ear-touches, and m is some mismatch measure, which can be devised as Equation (2).
Pacut [20] also solved the optimization problem of Equation (5), although his solution does not consider permutations. It can simply be extended as Algorithm 2. The main idea of this method is the notion of “best match” introduced earlier, and it is based on the properties of the 2-norm (Euclidean norm), which is already used in Equations (1) and (2).
Algorithm 2: Expansion of Algorithm 1 — Reference Template Creation

inputs: a series of M related ear-touches Y_1, …, Y_M
outputs: a template T that minimises the average mismatch to Y according to Equation (5)

k = 0
T_0 = Y_1        // initial guess
while (k = 0) or (T_k ≠ T_{k−1}) do
    k = k + 1
    compute the best match B_j of each Y_j w.r.t. T_{k−1}   // according to Algorithm 1
    T_k = pointwise average of B_1, …, B_M
end while
return T_k
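The fixed-point averaging loop of Algorithm 2 can be sketched as follows. Alignment is abstracted behind a callable; in the demo it is the identity (points assumed already aligned and ordered), purely to isolate the iteration itself — names and structure are our assumptions:

```python
def create_template(touches, best_match, tol=1e-9):
    """Algorithm 2 sketch: repeatedly align every ear-touch to the current
    template (via `best_match`) and average the aligned points, until the
    template stops changing (a fixed point of the Equation (5) scheme)."""
    template = [list(p) for p in touches[0]]          # initial guess
    while True:
        aligned = [best_match(template, t) for t in touches]
        new = [[sum(pts[i][0] for pts in aligned) / len(aligned),
                sum(pts[i][1] for pts in aligned) / len(aligned)]
               for i in range(len(template))]
        shift = max(abs(a - b)
                    for row_new, row_old in zip(new, template)
                    for a, b in zip(row_new, row_old))
        template = new
        if shift < tol:
            return template

# With alignment stubbed out (identity), the fixed point is simply the
# pointwise average of the input touches:
identity = lambda template, touch: touch
print(create_template([[(0, 0), (2, 2)], [(2, 0), (4, 2)]], identity))
# [[1.0, 0.0], [3.0, 2.0]]
```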
4.3. Matching in the Presence of Missing Points
In practice, both the template (T) and imposter (X) are variable length sequences of set points, so the one-to-one correspondence between their set points may not apply. This means that the previously mentioned algorithm for matching requires some modifications to overcome this challenge. First, a simpler scenario is assumed, in which the template is assumed to be complete, and then it is extended to the case where the template itself is an incomplete set of set points.
If the template is complete, then there must be an “injection” from the set of imposter set points into the set of template set points. We represent this injective function by a “partial permutation”. If we assume the imposter X′ has n set points and the template T has N set points, then each injection can be shown by an n-permutation of N, i.e., as an n-tuple $(p_1, \ldots, p_n)$ where the $p_i$ are distinct and $p_i \in \{1, \ldots, N\}$. The prime symbol (′) on X′ is just added to emphasise its incompleteness. If we denote the set of all such partial permutations by $\Pi_{n,N}$, then $m(T, X)$ can be reformulated as $m'(T, X')$ in Equation (6) to consider missing points in its calculations:

$$m'(T, X') = \min_{R \in \mathcal{R},\; t \in \mathcal{T},\; P \in \Pi_{n,N}} \sum_{i=1}^{n} \lVert T_{p_i} - (R X'_i + t) \rVert^2 \qquad (6)$$

The optimisation problem can also be solved by some extension of Algorithm 1, which is described in Algorithm 3, allowing the best match to be incomplete and returning an additional output: the index sequence E, which distinguishes between existing and non-existing set points. The definition of E is presented in Equation (7), and the incomplete best match (B, E) satisfies the equality in Equation (8).
Algorithm 3: Matching in the Presence of Missing Points

inputs: complete template T with N set points; incomplete ear-touch X′ with n set points
outputs: mismatch between T and X′ based on Equation (6); best match B and index sequence E of the incomplete best match satisfying Equation (8)

min = infinity
for each partial permutation P = (p_1, …, p_n) in Π_{n,N} do
    create a limited version of the template: S = (T_{p_1}, …, T_{p_n})
    find min over R, t of d(S, R·X′ + t) using Pacut’s method
    store the virtually complete best match of X′ w.r.t. S
    mismatch = the resulting minimum
    if mismatch < min then
        min = mismatch
        update B and E accordingly
    end if
end for
return min, B, E
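The search space of Equation (6) — the set of injections $\Pi_{n,N}$ — can be enumerated directly with the standard library; for the small point counts involved (N ≤ 8), brute force is cheap. A quick illustration, with the counts as an example rather than taken from the paper:

```python
from itertools import permutations
from math import perm  # math.perm, Python >= 3.8

# An imposter ear-touch with n = 4 points mapped into a complete
# template with N = 8 points: each injection is an n-tuple of
# distinct template indices, i.e. an n-permutation of N.
n, N = 4, 8
injections = list(permutations(range(N), n))
print(len(injections))  # 1680 = 8 * 7 * 6 * 5
assert len(injections) == perm(N, n)
```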
If the template T′ is itself incomplete, some set points of X′ may lose their equivalents in T′. So, it is important to know which set points are common to both ear-touches. If the total number of set points is denoted by N, and T′ and X′ have $n_T$ and $n_X$ set points, respectively, then the number of common set points $n_c$ obeys the following inequality: $\max(0,\; n_T + n_X - N) \le n_c \le \min(n_T, n_X)$. So, our algorithm should search all possible values in this interval, as described by Algorithm 4.
Algorithm 4: Expansion of Algorithm 3

inputs: N as the total number of existing set points; incomplete template T′; incomplete ear-touch X′
outputs: the minimum mismatch between T′ and X′, considering all possible numbers of common set points; best match B and index sequence E of the incomplete best match

min = infinity
for each possible number of common set points n_c in [max(0, n_T + n_X − N), min(n_T, n_X)] do
    for each subset Q of n_c indices of the set points of X′ do
        sort the elements of Q in ascending order as (q_1, …, q_{n_c})
        create a limited version of X′: Z = (X′_{q_1}, …, X′_{q_{n_c}})
        find the mismatch between Z and T′ according to Algorithm 3
        if mismatch < min then
            min = mismatch; update B and E
        end if
    end for
end for
return min, B, E
4.4. Creating a Template in the Presence of Missing Points
In Algorithm 2, the template is constructed by averaging the related set points of the best matches of all input ear-touches. Those best matches depend on the template itself; thus, an iterative scheme is employed to reach a fixed point, which is the optimum of Equation (5). The same can be done for incomplete ear-touches, as described in Algorithm 5, but there is a major drawback.
Algorithm 5: Reference Template Creation from a Group of Known Ear-Touches with Missing Set-Points

inputs: a sequence of incomplete ear-touches Y_1, …, Y_M with corresponding existence sequences E_1, …, E_M
outputs: an incomplete template T, extending Algorithm 2 to incomplete ear-touches

sort the inputs (Y and E) by number of existing set points in descending order and store them in the original variables
k = 0
T_0 = Y_1        // first guess: the most complete ear-touch
while (k = 0) or (T_k ≠ T_{k−1}) do
    k = k + 1
    compute the incomplete best match of each Y_j w.r.t. T_{k−1}   // w.r.t. Algorithm 4
    T_k = pointwise average of the best matches over the existing set points
end while
return T_k, E
The drawback is that the template’s existing set points are limited by the first guess: further ear-touches will not add any new set points to the template. The reason for this is the “if statement” block of Algorithm 3, which ignores extra set points of the given ear-touch in the process of creating the best match. This drawback can be fixed as described in Algorithm 6, but the fix imposes another challenge.
Algorithm 6: Expansion of Reference Template Creation from a Group of Known Ear-Touches with Missing Set-Points

Synopsis: improving the if-statement block of Algorithm 3 so that extra set points are kept:

if mismatch < min then
    min = mismatch
    for each extra set point of the given ear-touch (one without a counterpart in the template) do
        keep its transformed coordinates alongside the best match
    end for
end if
The vacant set points in the template are limited, meaning that during the template’s creation it is necessary to match and categorise all extra set points in each input ear-touch in order to aggregate them. This process, in combination with the iterative scheme, can result in a real computational burden. With this in mind, in our research, we preferred to apply the sub-optimal approach described in Algorithm 7 instead of the optimal strategy discussed above. The results were satisfactory and we did not develop our algorithm beyond this point.
Algorithm 7: Sub-Optimal Approach of Algorithm 6

inputs: a sequence of incomplete ear-touches Y_1, …, Y_M with corresponding existence sequences E_1, …, E_M
outputs: an incomplete template T, extending Algorithm 2 to incomplete ear-touches

sort the inputs (Y and E) by number of existing set points in descending order and store them in the original variables
T = Y_1
for j = 2 … M do
    update T with the incomplete best match of Y_j w.r.t. T   // w.r.t. Algorithm 4, modified by Algorithm 6
end for
return T, E
To ensure a comprehensive understanding of the methodology, a practical real-world example is furnished in Appendix A.
5. Results
The experimental scenario involved the limited ear-touch database. The users considered in the enrollment process were chosen randomly, and then the extracted features for each test user were calculated. We carried out the tests on our own ear-touch dataset. In verification systems, the problem of missing points caused by the physical properties of the ear is marginal, since it is always possible to acquire a proper ear-touch: pressing for longer and keeping the touchscreen on the ear takes only a few seconds, and users usually cooperate with the process. Hence, in the experiment, we concentrated on ear-touches with four or more set points and ignored those with fewer than four.
Experiments were carried out to calculate the performance gain of using set point coordinates in a matching system. The numbers of imposter and genuine matches were 36,315 ((270 × 269)/2) and 1110 (30 × 37), respectively. It should be mentioned that we did not consider the symmetric similarity of the same subject, or the similarity between identical ear-touches. The average times to create a template (feature extraction from a minimum of four and a maximum of eight set points) and to perform matching were 0.22 s and 0.003 s, respectively, on a PC with 8 GB RAM and a 2.6 GHz Core i3 CPU. Octave was used to implement all the programs.
The query images were captured for the subjects in a similar condition. The features of ear-touches were computed and the ear recognition decision was taken based on the calculated features.
In the following subsections, we present the results for the proposed methods. The False Rejection Rate (FRR) and False Acceptance Rate (FAR) were calculated, from which the Equal Error Rate (EER) was computed [
23]. The False Match Rate (FMR) is the rate at which a biometric process incorrectly matches biometric signals from two distinct individuals as coming from the same individual.
False Acceptance Rate (FAR): this measures the rate at which the system incorrectly accepts an unauthorized user.
False Rejection Rate (FRR): this measures the rate at which the system incorrectly rejects a legitimate user.
Equal Error Rate (EER): EER is the point where FAR and FRR are equal, and it is often used as a threshold to determine the system’s overall accuracy.
At the EER, the system is effectively balanced between its ability to accept legitimate users and reject impostors. In practical terms, lower EER values indicate better system performance, as they imply a reduced rate of both false acceptances and false rejections.
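A minimal sketch of how the EER can be estimated from genuine and impostor similarity scores follows; this is an illustrative threshold sweep, not the exact procedure used in the paper, and it assumes higher scores indicate better matches.

```python
import numpy as np

def eer(genuine_scores, impostor_scores):
    """Estimate the Equal Error Rate by sweeping a decision threshold
    over all observed similarity scores (higher score = better match)."""
    thresholds = np.sort(np.concatenate([genuine_scores, impostor_scores]))
    best = 1.0
    for t in thresholds:
        frr = np.mean(genuine_scores < t)    # legitimate users rejected
        far = np.mean(impostor_scores >= t)  # impostors accepted
        # The EER is approximated by the smallest max(FAR, FRR),
        # reached near the threshold where the two rates cross.
        best = min(best, max(far, frr))
    return best
```

With perfectly separated score distributions this returns 0.0; in practice the crossing point of FAR and FRR gives the reported EER.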
5.1. Exploratory Data Analysis
Let us now aim to gain a deeper understanding of this dataset. Exploratory data analysis is a crucial phase of the model development process. Our focus is on examining the distribution of the target class and assessing the potential of set points to distinguish between individual identities. The findings of this section guide our decisions on selecting features for model training and on the metrics suitable for model evaluation.
In this work, we used enrollment scenarios ranging from one to eight ear-touches per enrollment, and we evaluate the data for each of these enrollment sizes. Subjects with a single ear-touch are used only in the test set. The resulting training and test data are shown in
Table 2.
The distribution of the whole dataset is shown in
Figure 8. The figure illustrates the distribution of set points for ear-touches across 960 samples. Each sample consists of a maximum of nine set points. The individual distributions of set points are represented by the solid curves, where the
x-axis denotes the set point values and the
y-axis represents the probability density.
The overall mean of the set points, denoted by the dashed vertical line, is 424.58, indicating the central tendency of the dataset. The shaded region around the mean represents the overall standard deviation of 254.22, giving an indication of the dispersion of the set points. This visualization provides a comprehensive overview of the variability of ear-touch set points across samples, aiding the understanding of the dataset’s statistical characteristics.
Let us now examine the inter-correlations among the input features and their associations with the target variable. Given that all the input features and the target variable consist of numerical values, the Pearson correlation coefficient is employed for measuring the degree of correlation. As a widely utilized metric, the Pearson correlation coefficient quantifies linear relationships between two variables, ranging from +1 to –1. A coefficient magnitude exceeding 0.7 indicates a notably high correlation, while magnitudes falling between 0.5 and 0.7 signify moderately high correlation. Additionally, magnitudes ranging from 0.3 to 0.5 indicate low correlation, and values below 0.3 denote minimal to no correlation. The pairwise correlations can be efficiently computed through the Pandas library’s corr() function. The resultant correlation matrix is illustrated in
Figure 9.
In the correlation matrix, the features are the set-point coordinates gathered from the smartphone, and Target denotes the subjects. In
Figure 9, attention is initially directed to the first column, delineating the correlations between all input features and the target variable. It is evident that the features exhibit negligible correlation with the target class. Notably, the correlation coefficients bear negative values, indicating an inverse relationship: as the feature values increase, the corresponding target variable values decrease. This observation underscores the effective mitigation of rotation and translation challenges within our approach. Moreover, the analysis reveals substantial inter-correlation among several features. Notably, features such as
X7,
X8, and
X9 display high correlations with
Y7,
Y8, and
Y9, respectively. It is pertinent to observe that the majority of ear-touches register zero values for
X7,
X8, and
X9.
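The correlation analysis above can be reproduced with a few lines of Pandas. The column names (X1…X9, Y1…Y9, Target) and the random placeholder values below are assumptions matching the figure's labels; in the actual study, the columns hold the captured set-point coordinates and subject identities.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 960  # number of ear-touch samples in the database

# Hypothetical frame: set-point coordinates X1..X9 / Y1..Y9 plus the subject id.
data = {f"X{i}": rng.random(n) for i in range(1, 10)}
data.update({f"Y{i}": rng.random(n) for i in range(1, 10)})
data["Target"] = rng.integers(0, 92, n)  # 92 subjects

df = pd.DataFrame(data)

# Pairwise Pearson correlation matrix; the "Target" column shows each
# feature's correlation with the target class, as discussed above.
corr = df.corr(method="pearson")
print(corr["Target"].round(2))
```

With the real data, the first column of this matrix is what reveals the negligible (and negative) feature–target correlations reported above.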
5.2. Evaluation of the Recognition System without Missing Points
Our dataset contains some ear-touches with no missing points: 17 users have such ear-touches, 72 ear-touches in total. This means that each of these 17 users had at least three ear-touches with the same number of set points, located in almost the same places. In this section, we evaluate how our proposed method performs on this part of the dataset.
Figure 10 shows the FMR and FNMR for the data with no missing points, under both single-enrollment and multi-enrollment scenarios. For the evaluation without missing data, we used only three enrollment images, because more enrollment images were not available. The proposed method achieved an EER of 0.037 with a single enrollment image and an EER of 0.032 with multiple enrollment images.
Figure 11 depicts the Detection Error Trade-off (DET) curves for the proposed method on the ear-touch database. The DET curve shows the trade-off between FNMR and FMR.
5.3. Evaluation of the System in the Presence of Missing Points
In this section, we used the whole dataset to evaluate the system’s performance: 92 users and 960 ear-touches in total, including those with missing points. We evaluate how our proposed method performs on this full dataset. The recognition outcome for the proposed method is shown in
Figure 12.
Figure 13 depicts the DET curves for the proposed method on the ear-touch database. The experiments showed that the proposed method improved the results by about 0.17 (from 0.27 to 0.10) when eight ear-touches were used for enrollment. This gain is probably because, in the single ear-touch case, we assumed that all possible points of an ear-touch would appear in the ear-touch with the maximum number of points, whereas with eight enrollment ear-touches all possible points were considered and the computations were carried out over all set points.
5.4. Different Sample Numbers for Template Creation
We evaluated the proposed method on our dataset with various numbers of ear-touches for template creation, in order to examine how the number of ear-touches affects the results.
Table 3 compares the performance of our proposed method when various numbers of ear-touches are used for enrollment in the ear-touch recognition system. There are four enrollment subsets; for instance, “Enrol 1 ear-touch” means that one ear-touch per subject is used for training (template creation) and the remaining ear-touches are used for testing, and so on. We observed that the more ear-touches were used for enrollment, the better the achieved performance. Consequently,
Figure 14 compares the FMR and FNMR of our proposed method for the various enrollment sizes.
Subsequently,
Figure 15 depicts the DET curves for the proposed method on the ear-touch database. The experiments showed that the mean EER across all cross-validation folds is about 0.04, which is acceptable for a recognition system.
Overall, our experiments indicated a remarkable improvement in performance as we tested different numbers of ear-touches as enrollment presentations with the proposed method. The method was shown to provide precise and additional information and could be used for authentication and access control on smartphones. The results of this research strongly suggest that handling missing points is crucial and that they must be considered as part of the features.
6. Discussion
The goal of this research was to introduce a novel touch-based biometric characteristic for mobile devices. This section discusses the main findings in relation to the published work on ear-touch recognition methods on mobile devices.
Our authentication system aims to verify the authenticity of users based on ear-touch images. While it may appear that a binary classifier could be a straightforward solution, the task at hand is more nuanced and falls into the realm of recognition rather than a binary classification.
The following are key points to consider:
Diversity in ear-touch patterns: Unlike traditional binary classification problems where distinct classes are well-defined, ear-touch patterns can exhibit significant variability among individuals. This diversity makes it challenging to define a single binary boundary that separates authentic from non-authentic users.
Enrollment for recognition: To effectively capture and adapt to the diversity mentioned above, our system employs an enrollment phase. During this phase, the system learns and recognizes unique features in each user’s ear-touch pattern. This allows for a more personalized and accurate recognition process, enhancing the overall security and reliability of the authentication system.
Recognition as a multiclass problem: In essence, our authentication system can be better conceptualized as a multiclass recognition problem rather than a binary classification problem. Each enrolled user represents a distinct class, and the system’s task is to correctly identify the user among the enrolled set.
Achieved equal error rate: The reported average equal error rate of 0.04 in our study underscores the effectiveness of our approach. This metric accounts for both false acceptance and false rejection rates, providing a comprehensive evaluation of the system’s performance in a recognition context.
This section concludes with a discussion of the limitations of this research, areas for future work, and a short summary. The following questions frame the discussion and the possibilities for future study:
Question 1: What makes ear-touch person authentication useful on mobile devices?
Question 2: How does the method used to align the ear-touches affect the performance of the biometric characteristic?
To answer the first question, let us consider the existing ear-touch biometric characteristics. A proposed method in [
16] describes a similar biometric system but with a different type of data acquisition: it scans the whole ear with a specific screen, which is not possible on a normal smartphone, meaning that the authors modified the kernel of the mobile device. Therefore, to make an ear-touch biometric system generally available, we used the touch screen in its normal mode. According to the results, the acquired data can serve as a biometric characteristic.
The method proposed in [
20] for the alignment of the ear-touches achieves an equal error rate of 0.04. It might even be considered a biometric system for identification, but this would need to be evaluated on a large database.
This biometric system has an advantage in terms of data acquisition, which can be carried out easily, and it could be deployed on any mobile device with a multi-touch screen. However, the number of features obtained for the biometric characteristic might be lower than with dedicated capture methods.
7. Conclusions
In this study, we have introduced a novel biometric authentication approach tailored for mobile devices equipped with multi-touch screens. Our proposed method leverages the distinctive characteristics of ear-touch patterns, offering a seamless and secure method for individuals to authenticate themselves. By harnessing the inherent capabilities of the multi-touch screen as a sensor, we have developed a robust authentication system.
Our efforts encompass the creation of a comprehensive database consisting of 92 subjects and a total of 960 ear-touch images. To facilitate the extraction and matching of these unique ear-touch features, we employed a theoretical method as outlined in [
20]. Remarkably, our methodology yielded an impressive equal error rate (EER) of just 0.04, underlining the effectiveness of our approach.
In conclusion, this research serves as an important stepping stone for the exploration and application of a novel biometric characteristic acquired through multi-touchscreen technology, which is increasingly prevalent in the majority of mobile devices. The potential implications are far-reaching, promising enhanced mobile security and user convenience, thereby setting the stage for further advancements in this promising field.