1. Introduction
A handwritten signature is one of the most commonly used authentication techniques in banking and the judiciary [1]. The handwritten signature belongs to behavioral biometrics, which refers to unique behavioral characteristics that can be used for human authentication. User authentication based on behavioral biometrics offers stability and simplicity [2]. Biometric recognition is defined as the automatic identification and verification of persons. Verification is the mode in which the final result is simply yes or no, i.e., whether a signature belongs to the claimed identity, whereas identification determines which enrolled person a signature belongs to. Research on handwriting recognition can be divided into handwritten text recognition, handwritten signature recognition, etc. Handwritten text recognition refers to the process of automatically recognizing the text written in an image of a text line, a paragraph, or even whole pages [3]. Handwritten signature recognition identifies the signer from the signature. The research in this paper focuses on user identification.
Offline handwritten signature recognition has long been a challenging task in computer vision. Offline signature recognition can be addressed with writer-dependent and writer-independent approaches. Writer-dependent (WD) means that a classifier is trained separately on each person's samples; in other words, when a new user enrolls, a classifier must be trained independently for that user. Writer-independent (WI) means that the classifier is trained on writers separate from those used for testing, so the users in the test set may be unseen during training. The WI approach [4] is used when fewer signature images are available for training, although it also risks missing many writer-specific features. Therefore, the WD method was used in this paper.
Recently, some attempts have been made to address the failure of feature selection methods when only a few features are available. Sharif M et al. [5] presented a new feature optimization and selection technique that employs a genetic algorithm to select the optimal features directly based on a fitness function over the features. A new extraction method was also introduced, and good results were obtained.
Hadjadji, B. et al. [6] proposed an open handwritten signature identification system (OHSIS) combining classifiers based on the curvelet transform and one-class principal component analysis (OC-PCA). In a natural environment, few signature images are available for training, so they studied the writers in WI mode. They therefore proposed a new method for combining multiple individual OHSISs based on density estimation to achieve an efficient system. A design protocol was also presented to select suitable parameters for a new writer.
Mshir, S. et al. [7] put forward a novel signature verification and recognition technique in which a Siamese network is trained on two datasets. Offline signature verification used a convolutional Siamese network, and the final recognition rate on the Kaggle dataset was 84%. Tests performed on a cross-domain dataset showed that the network could handle forgeries across several languages and handwriting styles.
Other studies have been validated using different classifiers. Elssaedi M. M. et al. [8] compared five well-known classifiers, such as gradient boosted trees, extracted dynamic and static features (width, height), and used RapidMiner for feature selection. Their experiments showed that, on the dataset used, the best classifier for recognition was a neural network, with an accuracy of 92.88%. In this study, we likewise evaluated our dataset with random forest and KNN classifiers to verify the proposed method's effectiveness.
Jagtap, A. B. et al. [9] first preprocessed the image and then extracted upper and lower envelope features from the preprocessed image. The upper envelope was extracted by scanning each column of the image from top to bottom. Finally, the extracted features were fused, and experiments on the SVC2004 dataset yielded an accuracy of 98.5%.
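As an illustration of this column-scanning idea, the following minimal sketch (our own illustration, not the code of [9]; the function name and the normalization by image height are assumptions) records the first and last ink pixels of each column of a binarized signature image as the upper and lower envelopes:

```python
import numpy as np

def envelope_features(binary_img: np.ndarray) -> np.ndarray:
    """binary_img: 2D array with ink pixels == 1 and background == 0."""
    h, w = binary_img.shape
    upper, lower = np.zeros(w), np.zeros(w)
    for col in range(w):
        rows = np.flatnonzero(binary_img[:, col])
        if rows.size:                      # column contains ink
            upper[col] = rows[0] / h       # first ink pixel from the top (upper envelope)
            lower[col] = rows[-1] / h      # last ink pixel (lower envelope)
    # Fuse the two envelopes into a single feature vector
    return np.concatenate([upper, lower])

# Example: a blank 64x128 image with one horizontal stroke
img = np.zeros((64, 128))
img[30:33, 10:120] = 1
vec = envelope_features(img)               # length 2 * 128 = 256
```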
Several studies have addressed the issues of improving accuracy and reducing the number of required samples. Matsuda K. et al. [10] proposed a random forest (RF)-based technique, a modification of a joint segmentation verification method for multi-script signature authentication. The strategy of this technique was to apply different fusion methods to multiple signature identifiers. Gradient features were extracted, and shape features were used to represent the writer's pen pressure and speed. Experiments were conducted on the SigComp2011 (Chinese, Dutch) and SigComp2013 (Japanese) signature datasets, comparing three methods of signature image generation: the martingale distance of grayscale images, RGB-channel images, and histograms of grayscale images. The effectiveness of the presented approach was verified.
Recently, some studies have been conducted on offline signature recognition for minority scripts. For example, Zhang S. J. et al. [11] proposed a bag-of-visual-words (BoVW)-based feature selection algorithm, MRMR, for offline signature verification. Visual word features were extracted, and the maximum relevance minimum redundancy (MRMR) algorithm was employed for feature selection. The k-means clustering method was used to cluster the signature images, and a support vector machine (SVM) was used for classification. Experiments were conducted on the Uyghur signature dataset of a self-built database and on the CEDAR signature dataset [12], obtaining recognition rates of 93.81% and 95.38%, respectively. That work studied only Uyghur, whereas our study was conducted on Kazakh, Han, and Uyghur datasets.
In addition to traditional learning methods, many researchers have recently explored deep learning methods for offline handwritten signature recognition. Tuncer T. et al. [13] studied feature extraction using convolutional neural networks (CNNs) and proposed an iterative minimum redundancy maximum relevance (IMRMR) [14] method for the automatic selection of optimal features, which are then used as the input of an SVM. Others have proposed new deep learning frameworks or models for the offline handwritten signature problem [15,16,17,18,19]. To address the problem of limited training data, the SigNet CNN, inspired by Krizhevsky et al. [20] and modified by Hafemann, L. G. et al. [21], was proposed; it introduced transfer learning, fine-tuning a pretrained model with the limited number of available signature images (target data). Finally, a BF-SVM classifier was used for classification, effectively addressing the offline signature identification problem.
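As a rough illustration of this transfer-learning strategy (a sketch under the assumption that an ImageNet-pretrained ResNet-18 stands in for the pretrained model and a standard RBF SVM stands in for the BF-SVM; this is not the SigNet implementation), a pretrained backbone can serve as a feature extractor for a small set of signature images, with an SVM trained on the resulting vectors:

```python
import torch
import torch.nn as nn
import torchvision.models as models
from sklearn.svm import SVC

# Pretrained backbone with the classification head removed
# (assumption: ImageNet weights stand in for the large source dataset).
backbone = models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = nn.Identity()              # outputs 512-D feature vectors
backbone.eval()

def extract_features(batch: torch.Tensor) -> torch.Tensor:
    """batch: (N, 3, 224, 224) preprocessed signature images."""
    with torch.no_grad():
        return backbone(batch)

# Hypothetical target data: a few signature images per writer (placeholders).
images = torch.randn(48, 3, 224, 224)
labels = [i // 3 for i in range(48)]     # 16 writers, 3 samples each
feats = extract_features(images).numpy()

clf = SVC(kernel="rbf")                  # stands in for the BF-SVM classifier
clf.fit(feats, labels)
```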
In this paper, we adopted the WD method to address the lack of studies on minority scripts and of publicly available datasets. We built a dataset containing Uyghur, Kazakh, and Han signatures, and our study was conducted on this dataset and on public datasets.
Table 1 shows the quantity of data in our self-built dataset compared with publicly available datasets. We propose a recognition approach based on the fusion of local maximal occurrence (LOMO) features and histogram of oriented gradients (HOG) features. We performed feature selection on the extracted feature vectors using principal component analysis to improve recognition efficiency and speed. Finally, random forest and KNN classifiers were used for evaluation, and the CEDAR public dataset was used for testing to demonstrate the efficiency of the presented method.
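As a concrete outline of this pipeline, the sketch below (a minimal illustration assuming scikit-image and scikit-learn; the LOMO step is only a placeholder, since no standard implementation is assumed here, and the parameter values are illustrative) fuses HOG features with a stand-in descriptor, reduces the fused vectors with PCA, and trains random forest and KNN classifiers:

```python
import numpy as np
from skimage.feature import hog
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

def lomo_features(img: np.ndarray) -> np.ndarray:
    # Placeholder: substitute a real LOMO implementation here.
    return np.histogram(img, bins=32, range=(0.0, 1.0))[0].astype(float)

def fused_features(img: np.ndarray) -> np.ndarray:
    """img: 2D grayscale signature image with values in [0, 1]."""
    h = hog(img, orientations=9, pixels_per_cell=(16, 16), cells_per_block=(2, 2))
    return np.concatenate([h, lomo_features(img)])

# Placeholder data: random "signature" images and writer labels.
rng = np.random.default_rng(0)
images = [rng.random((128, 256)) for _ in range(40)]
labels = [i // 2 for i in range(40)]                 # 20 writers, 2 samples each

X = np.stack([fused_features(img) for img in images])
X = PCA(n_components=0.95).fit_transform(X)          # keep 95% of the variance

rf = RandomForestClassifier(n_estimators=200).fit(X, labels)
knn = KNeighborsClassifier(n_neighbors=3).fit(X, labels)
```

In the actual system, the placeholder descriptor would be replaced by the LOMO features described above before fusion and PCA.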
Our signature dataset contains three subsets: the Uyghur dataset, the Kazakh dataset, and the Han (Chinese) dataset. The Uyghur and Han datasets each contain signature samples from 160 individuals, the same number of signers as the BHSig260-Hindi dataset. We followed the same protocol as GPDS to collect these signatures; hence, 24 genuine signatures are available for each signer. Since we study signature recognition, which requires authentic signatures, only genuine signatures were collected, not forged ones.
In this paper, Section 2 introduces the self-built dataset, Section 3 explains the proposed methods, and Section 4 presents the experimental results. Finally, the conclusion is given in Section 5.
Figure 1 shows the main flowchart of offline signature recognition in multiple languages.