Digitalization has made data in digital healthcare systems easy to access. Given the sensitivity of confidential health information, the need for security has grown accordingly. One of the most important security mechanisms is authentication. The available authentication models that rely on Machine Learning (ML) have shortcomings, such as difficulty in adding new users to the system and sensitivity of model training to imbalanced data. To address these problems, we propose an application of Siamese networks to ECG signals, which are readily available in digital healthcare systems. Adding preprocessing for feature extraction to such a model leads to promising results. The model is evaluated on the ECG-ID and PTB datasets and reaches about 92% and 95% accuracy, respectively. Its combination of simplicity and high performance makes it a strong candidate for smart healthcare and telehealth.
HIPAA: Health Insurance Portability and Accountability Act; PINs: Personal Identification Numbers; EEG: Electroencephalogram; PPG: Photoplethysmography; ECG: Electrocardiogram; ML: Machine Learning; DNN: Deep Neural Network; CNN: Convolutional Neural Network; LSTM: Long Short-Term Memory; RNN: Recurrent Neural Network; EER: Equal Error Rate; FFT: Fast Fourier Transform; DFT: Discrete Fourier Transform; PTB: Physikalisch-Technische Bundesanstalt; ECG-ID: ECG-Identification.
The rapid growth of digital technologies, telehealth, and other digital healthcare systems has become a prominent topic worldwide. The use of telehealth has grown rapidly, especially during the pandemic: the first quarter of 2020 in the U.S. showed a 154% increase in telehealth visits [1]. Safeguarding information about users in digital healthcare systems is crucial. For instance, in the U.S., the Health Insurance Portability and Accountability Act of 1996 (HIPAA) was designed to protect patient data. The HIPAA Privacy Rule requires that covered entities apply reasonable safeguards to protect the privacy of Protected Health Information (PHI) from impermissible uses or disclosures [2].
Given the sensitivity and nature of the data processed in different fields, companies focus on providing a secure environment for clients. However, security weaknesses such as unauthorized access have been reported in recent years. This motivates the search for ways to prevent unauthorized parties from accessing individuals' private information [3]. There are also generally accepted security aspects such as confidentiality, data integrity, and auditing. One of the most prominent proactive methods is authentication, which refers to the process of verifying the identity of an entity. In authentication, a user proves their identity by providing a credential that is checked against the stored credentials in a database or repository of authorized users [4].
Recently, the growing number of forgery incidents has reduced the effectiveness of conventional verification strategies such as security tokens, passwords, and Personal Identification Numbers (PINs). To address this issue, security researchers have developed biometric-based authentication systems with higher recognition rates to replace traditional authentication systems and prevent unauthorized access. In recent years, different biometric features such as fingerprint, iris, Electroencephalogram (EEG), Photoplethysmography (PPG), and Electrocardiogram (ECG) [5] have been utilized for biometric authentication.
Machine Learning (ML) is one of the most successful tools used in various fields to solve problems. ML methods are adopted to develop verification models for user identification [6]; our model does so using the user's ECG data. In summary, ML-based authentication approaches exploit specific characteristics of the data in a multi-dimensional feature space. The resulting models are cost-effective, more reliable, independent of human operators, faster, and more precise.
Different ML models have been used to develop biometric-based authentication systems. In many studies, a Deep Neural Network (DNN) was applied [7]. The major advantage of DNNs is that they can extract a suitable feature set directly from raw data without much manual effort. Supervised models, such as DNN-based classifiers, are widely used for authentication [8]. The autoencoder, in turn, is an unsupervised artificial neural network that learns to efficiently compress and encode data and then to reconstruct the data from the reduced encoded representation so that it is as close to the original input as possible [9]. Using such DNN-based models for authentication has some shortcomings:
Needing a large amount of data for model training [10].
Sensitivity to imbalanced data, which is prevalent in available datasets [6].
Enrolling a new entity in the model requires training the model from scratch and applying new hyperparameters after retraining.
Siamese networks are a specific type of DNN that can address the issues mentioned above. A Siamese network consists of two or more identical subnetworks [11]. The main idea behind Siamese networks is that they learn useful data descriptors that can then be used to compare the inputs of the separate subnetworks. They have been utilized in different studies, such as facial and palm print recognition [12].
The comparison part of a Siamese network only evaluates the similarity between two inputs, so it does not require a large amount of data for model training. Instead, pairs of samples from the same person and from different persons constitute the training samples. Consequently, a new entity can be added to our model without retraining, since the model only calculates similarities.
Records of human activities are likely to be imbalanced, mainly because some activities occur far more frequently than others. Hence, it is harder to train classifiers on imbalanced human activity datasets. While ML algorithms perform well on balanced datasets, they often perform poorly on imbalanced datasets [13]. This is where Siamese networks can play a positive role: one of their advantages is good accuracy when classes are imbalanced [14].
In this article, we developed an authentication application that uses ECG signals in Siamese networks. It provides the advantages mentioned above for authentication in digital healthcare systems and meets our requirements. The main contribution of our proposed method is to apply the preprocessing required to use ECG signals in Siamese networks and to provide a repository for registered users. This way, enrolling a new user to be authenticated in the system does not require retraining the model. With this preprocessing, the model is also no longer sensitive to imbalanced data and can authenticate individuals with any number of samples. In addition, applying a Fourier transform to the input signals helps to extract more reliable features from the samples to train a robust network.
The rest of this paper is organized as follows. Section 2 discusses related work and reviews prominent research studies in this area. We present the materials and methods in Section 3. In Section 4, we introduce the datasets used for evaluating the proposed model and present the results of our implementation. In the final section, we conclude our work and provide some ideas for improving this method in future work.
The ECG is a reliable and well-studied medical signal that records the electrical activity of muscle fibres in different areas of the heart. This signal is widely used for detecting different heart conditions or cardiac abnormalities. Since it is available in most digital healthcare systems and in new mobile devices, and given its uniqueness for each person, it is a good choice for authentication systems in telehealth.
ECG-based authentication systems are divided into two classes according to the features used for recognition: fiducial and non-fiducial [15]. Fiducial methods extract six characteristic points from the heartbeat wave (P, Q, R, S, T, U), which are then used to derive features such as amplitude and latency. Non-fiducial techniques, on the other hand, analyze all or part of the ECG waveform to extract useful information. A third, partially fiducial, approach combines the two [15]. Figure 1 depicts the main points of the ideal heartbeat. The QRS complex is the combination of three graphical deflections in the ECG signal.
Various DNN structures have been proposed for ECG-based authentication systems. One of the most popular DNN structures is the Convolutional Neural Network (CNN), which is used for processing data with stationary properties, that is, patterns that repeat across the samples [16]. It takes an instance as input and distinguishes it from others by assigning a label to the sample [17]. It consists of three kinds of layers: convolution, pooling, and fully connected layers. The first two extract features, while the third maps these features to the final output, such as a classification [18].
The second structure is Long Short-Term Memory (LSTM), a well-known type of Recurrent Neural Network (RNN). LSTMs are able to learn long-term dependencies in sequence prediction problems [19]. Hence, they can be used in applications such as speech recognition and ECG biometrics [20]. LSTM can resolve the problem of vanishing gradients, and as a result, good performance is obtained [21].
CNN-LSTM is a hybrid model used in this area; it is an LSTM architecture specifically designed for sequence prediction problems with spatial inputs, like images or videos [19]. In this hybrid, feature extraction and classification are performed in the same network.
The above structures (CNN, LSTM, and CNN-LSTM) have been applied to ECG authentication in previous papers as classifiers [22], reaching more than 90% classification accuracy. These approaches store the features from the layer before the last one in a database. In the test phase, a threshold is then used to compare each input with the features saved in the database. Although this approach has been used for authentication, in most cases the final authentication results are not as good as the classification results.
Two research papers have focused on ECG-based authentication systems using Siamese networks. Ibtehaz N, et al. [23] reported 99% classification accuracy using a single heartbeat and 100% accuracy by fusing multiple beats. By utilizing a Siamese architecture, they also reported that the identification Equal Error Rate (EER) of their proposed model could be decreased to 1.29%. However, they did not provide any implementation code.
Ivanciu L, et al. [24] used images instead of signals as input to their proposed Siamese network structure. Their contribution was the use of such input data for authentication, although they faced some difficulties in data processing and noise removal. Their model reached an overall classification accuracy of 86.4%.
This paper proposes an ECG-based biometric system using Siamese networks for authentication. Figure 2 depicts the schematic view of the proposed model. The model contains two main steps: preprocessing and Siamese networks. The Siamese networks in turn include feature extraction and comparison parts, which are discussed in detail below.
In this model, the Fourier series [25] is applied to our dataset using the sine function. A Fourier series is the sum of an infinite number of sines and cosines. It makes use of the orthogonality relationships of the sine and cosine functions. The Fourier series is useful for representing periodic signals such as ECG signals.
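For a signal $f(t)$ with period $T$, the series in its standard form is

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left[a_n \cos\left(\frac{2\pi n t}{T}\right) + b_n \sin\left(\frac{2\pi n t}{T}\right)\right]$$

where:

$$a_n = \frac{2}{T}\int_{0}^{T} f(t)\cos\left(\frac{2\pi n t}{T}\right)dt, \qquad b_n = \frac{2}{T}\int_{0}^{T} f(t)\sin\left(\frac{2\pi n t}{T}\right)dt$$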
A Fourier transform is a mathematical technique that converts a function of time into a function of frequency. We also used the Fast Fourier Transform (FFT) in our paper. The FFT is an algorithm for computing the Discrete Fourier Transform (DFT). It produces the same result as computing the DFT directly but with far fewer operations, so computation is faster. Consequently, the FFT has been proposed for feature point extraction in ECG signals, such as the PQRST points [23,26,27].
In our proposed model, the FFT described above is applied to our datasets after applying the sine function. We apply the Fourier series to both datasets. As a result, each Physikalisch-Technische Bundesanstalt (PTB) sample changes from 200 × 1 to 200 × 2, and each ECG-Identification (ECG-ID) sample changes from 256 × 1 to 256 × 2.
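The change of shape from N × 1 to N × 2 can be illustrated with a minimal sketch. The text does not state what the two output columns contain, so this example assumes they are the real and imaginary parts of the FFT; the function name `to_two_channel_spectrum` and the synthetic sine inputs are hypothetical and stand in for real ECG segments.

```python
import numpy as np

def to_two_channel_spectrum(signal):
    """Map a 1-D signal of length N to an (N, 2) array of FFT features.

    Assumption: the two columns are the real and imaginary parts of the
    FFT. The paper does not specify what the second column holds.
    """
    x = np.asarray(signal, dtype=float).reshape(-1)  # accept (N,) or (N, 1)
    spectrum = np.fft.fft(x)                         # complex array, length N
    return np.stack([spectrum.real, spectrum.imag], axis=-1)

# Synthetic stand-ins for one PTB-sized and one ECG-ID-sized segment.
ptb_like = np.sin(2 * np.pi * 5 * np.arange(200) / 200)
ecgid_like = np.sin(2 * np.pi * 5 * np.arange(256) / 256)

ptb_features = to_two_channel_spectrum(ptb_like)      # shape (200, 2)
ecgid_features = to_two_channel_spectrum(ecgid_like)  # shape (256, 2)
```

Other two-channel encodings, such as magnitude and phase, would give the same output shape; the choice here is only for illustration.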
The data is prepared so that half of the signal pairs come from the same person and the other half from different users. Because the pairs are constructed this way, the training set is balanced even if the initial data is not. This is why class imbalance in the original data does not pose a problem for the model.
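The pair construction described above can be sketched as follows. This is an illustrative sampler under stated assumptions, not the authors' code: the function name `make_balanced_pairs`, the alternating sampling scheme, and the toy data are all hypothetical.

```python
import random

def make_balanced_pairs(samples_by_user, n_pairs, seed=0):
    """Return n_pairs tuples (a, b, label): half same-user (1), half different (0)."""
    rng = random.Random(seed)
    users = list(samples_by_user)
    # A same-user pair needs at least two samples from that user.
    multi = [u for u in users if len(samples_by_user[u]) >= 2]
    if len(users) < 2 or not multi:
        raise ValueError("need >= 2 users and one user with >= 2 samples")
    pairs = []
    for i in range(n_pairs):
        if i % 2 == 0:
            # Positive pair: two distinct samples from one user.
            u = rng.choice(multi)
            a, b = rng.sample(samples_by_user[u], 2)
            pairs.append((a, b, 1))
        else:
            # Negative pair: one sample from each of two distinct users.
            u, v = rng.sample(users, 2)
            pairs.append((rng.choice(samples_by_user[u]),
                          rng.choice(samples_by_user[v]), 0))
    rng.shuffle(pairs)
    return pairs

# Imbalanced toy data: user "a" has six samples, user "c" has only one.
toy = {"a": ["a0", "a1", "a2", "a3", "a4", "a5"], "b": ["b0", "b1"], "c": ["c0"]}
pairs = make_balanced_pairs(toy, n_pairs=100)
```

Even though the per-user sample counts differ, the resulting labels are split evenly between the two classes, which is the property the paragraph relies on.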
The inputs to the Siamese networks are pairs of ECG signals. The output of the entire model is the similarity value between the two vectors extracted from the input signals at the end of the first section. We used the inverse of the L1 norm (Manhattan distance) to determine the similarity between the two inputs, followed by a standard sigmoid function with a threshold of 0.5 to discriminate accepted from rejected pairs. The output is a value between 0 and 1. It tends to be close to 1 if the two ECG signals are from the same person and close to 0 if they come from two different individuals.
According to Figure 2, the two CNNs have an identical structure, and the parameters are shared between these two parts of the network. As for the individual layers, max pooling takes the largest value in each region of the feature map, which reduces the dimensionality. The primary task of these networks is to extract feature vectors from the signals. Afterward, the two vectors are compared by computing their absolute differences. The final step applies the sigmoid layer to obtain the similarity value.
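A minimal sketch of the comparison part follows: the element-wise absolute difference of two feature vectors, a weighted sum, and a sigmoid thresholded at 0.5. The weights and bias are hypothetical placeholders for parameters that would be learned during training, and the function names are illustrative.

```python
import math

def similarity(vec_a, vec_b, weights, bias):
    """Sigmoid of a weighted sum of element-wise absolute differences."""
    diffs = [abs(a - b) for a, b in zip(vec_a, vec_b)]
    logit = bias + sum(w * d for w, d in zip(weights, diffs))
    return 1.0 / (1.0 + math.exp(-logit))

def is_same_person(score, threshold=0.5):
    """Accept the pair when the similarity reaches the threshold."""
    return score >= threshold

# Hypothetical parameters: negative weights so larger differences lower the score.
weights = [-2.0, -2.0, -2.0, -2.0]
bias = 1.0

same_score = similarity([0.2, 0.4, 0.1, 0.9], [0.2, 0.4, 0.1, 0.9], weights, bias)
diff_score = similarity([0.2, 0.4, 0.1, 0.9], [1.2, 1.4, 1.1, 1.9], weights, bias)
```

With these placeholder parameters, identical vectors score above 0.5 and are accepted, while vectors that differ by 1 in every component score far below it.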
Registration in the authentication system requires storing one sample from each authorized user in the system repository. To this end, after the preprocessing step, the vector extracted by the CNN part is saved for later comparison. For this reason, there is no need to change the model and train it again to add and register a new user in the system; storing the sample under the stated conditions is sufficient.
In the test phase, one of the network branches is removed. In other words, since the two network branches share the same parameters, saving just one of them is enough. Figure 3 shows the final trained model. A new input is then passed through that single path, and the vector extracted from it is sent to the comparison part, where it is compared with the data from the same person kept in the repository. In the final step, based on the similarity value, the model accepts or rejects the input sample.
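The enrollment and verification flow can be sketched as below. The class `UserRepository` is a hypothetical illustration: `embed` stands in for the single trained CNN branch and `similarity` for the comparison part, and both toy implementations used here are assumptions chosen only to make the example runnable.

```python
class UserRepository:
    """Stores one embedding per enrolled user and verifies identity claims."""

    def __init__(self, embed, similarity, threshold=0.5):
        self._embed = embed            # stand-in for the trained CNN branch
        self._similarity = similarity  # stand-in for the comparison part
        self._threshold = threshold
        self._templates = {}

    def enroll(self, user_id, signal):
        # Enrollment only stores a vector; the model itself is unchanged.
        self._templates[user_id] = self._embed(signal)

    def verify(self, claimed_id, signal):
        template = self._templates.get(claimed_id)
        if template is None:
            return False  # unknown users are rejected
        score = self._similarity(template, self._embed(signal))
        return score >= self._threshold

# Toy stand-ins (assumptions): identity embedding, similarity = 1 / (1 + L1).
def toy_embed(signal):
    return list(signal)

def toy_similarity(u, v):
    return 1.0 / (1.0 + sum(abs(a - b) for a, b in zip(u, v)))

repo = UserRepository(toy_embed, toy_similarity)
repo.enroll("alice", [0.1, 0.2, 0.3])
repo.enroll("bob", [0.9, 0.8, 0.7])  # added later, no retraining step

accepted = repo.verify("alice", [0.1, 0.25, 0.3])
rejected = repo.verify("alice", [2.0, 2.0, 2.0])
```

Storing the extracted vector instead of the raw signal mirrors the repository described in the text; adding "bob" touches only the dictionary of templates.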
This section discusses the experimental setting and the results of applying the proposed method to two benchmark datasets. To this end, we introduce the datasets and the metrics needed to evaluate the final model. We also describe the hyperparameters used in the final model, and then compare and discuss the implementation results.
The ECG-ID database and the PTB dataset were chosen to evaluate our proposed model. The ECG-ID database contains 310 ECG recordings obtained from 90 persons, 44 men and 46 women. The PTB dataset consists of 549 records from 290 subjects. Each ECG-ID signal is digitized at 500 Hz with 12-bit resolution over a nominal ± 10 mV range, whereas each PTB signal is digitized at 1000 samples per second with 16-bit resolution over a ± 16.384 mV range. Table 1 summarizes the features of these datasets. In this paper, we used the signals to train the Siamese network. Figure 4 depicts samples of the ECG-ID and PTB signals (left and right, respectively).
Table 1: Specifications of the benchmark datasets, including ECG signals for authentication.
Database | # of Records | # of Subjects | Resolution | Nominal Range |
ECG-ID | 310 | 90 (44 men and 46 women) | 12-bit | ± 10 mV |
PTB | 549 | 290 (209 men and 81 women) | 16-bit | ± 16.384 mV |
Accuracy is used to evaluate how many data samples our model predicts correctly. Predictions are obtained from the sigmoid output, which takes a value between 0 and 1 and can be interpreted as the probability or similarity of a pair. Cross-entropy loss is the criterion used to update the model weights during training.
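The two quantities above can be computed as in the following sketch. The 0.5 decision threshold matches the one stated for the sigmoid output earlier in the paper; the function names, the clipping constant `eps`, and the example scores are illustrative assumptions.

```python
import math

def accuracy(scores, labels, threshold=0.5):
    """Fraction of pairs whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def binary_cross_entropy(scores, labels, eps=1e-12):
    """Mean binary cross-entropy; scores are clipped to avoid log(0)."""
    total = 0.0
    for s, y in zip(scores, labels):
        s = min(max(s, eps), 1.0 - eps)
        total += -(y * math.log(s) + (1 - y) * math.log(1.0 - s))
    return total / len(labels)

# Made-up sigmoid outputs for four pairs and their true labels.
scores = [0.9, 0.2, 0.6, 0.4]
labels = [1, 0, 0, 1]

acc = accuracy(scores, labels)               # two of the four pairs are correct
loss = binary_cross_entropy(scores, labels)
```

Accuracy only counts which side of the threshold each score falls on, whereas the loss also penalizes how far a score is from its label, which is why the loss rather than the accuracy drives the weight updates.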
Figure 5 depicts the overall view of the proposed model and its parameters. In our paper, we apply the Fourier series to both datasets. This process converts each 200 × 1 signal in the PTB dataset to 200 × 2 and each 256 × 1 signal in the ECG-ID dataset to 256 × 2. Figure 6 shows the details of the sequential part.
For more detail on the training phase, each signal is fed into the CNN model shown in Figure 6. Each model includes eight 1D convolutional layers with the ReLU activation function: 32 filters for the first two convolutional layers, 64 filters for the next two, 128 filters for the next two, and 256 filters for the last two. The kernel size is 3, and a max-pooling layer is used between consecutive convolutional layers, with same padding. Then, the units are flattened, and a dense layer of 512 neurons is applied. In the last layer of the model, the network applies the sigmoid activation function to the Euclidean distance computed between the two signals. L2 regularization is applied.
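To make the layer stack described above concrete, the sketch below traces the tensor shapes and parameter counts of one branch without depending on a deep learning framework. Several details are assumptions not given in the text: a pooling size of 2 with stride 2, same padding for both convolution and pooling, stride-1 convolutions, and seven pooling layers (one between each pair of consecutive convolutions). The function `trace_branch` is hypothetical.

```python
import math

FILTERS = [32, 32, 64, 64, 128, 128, 256, 256]  # from the text
KERNEL_SIZE = 3                                  # from the text
DENSE_UNITS = 512                                # from the text
POOL_SIZE = 2                                    # assumption: not stated

def trace_branch(length, channels):
    """Return (layers, flattened_size, n_params) for one CNN branch."""
    layers = []
    n_params = 0
    for i, filters in enumerate(FILTERS):
        # Conv1D with same padding and stride 1 keeps the sequence length.
        conv_params = KERNEL_SIZE * channels * filters + filters
        n_params += conv_params
        channels = filters
        layers.append((f"conv{i + 1}", length, channels, conv_params))
        if i < len(FILTERS) - 1:
            # Max pooling with same padding rounds the length up.
            length = math.ceil(length / POOL_SIZE)
            layers.append((f"pool{i + 1}", length, channels, 0))
    flat = length * channels
    dense_params = flat * DENSE_UNITS + DENSE_UNITS
    n_params += dense_params
    layers.append(("dense", 1, DENSE_UNITS, dense_params))
    return layers, flat, n_params

ptb_layers, ptb_flat, ptb_params = trace_branch(200, 2)
ecgid_layers, ecgid_flat, ecgid_params = trace_branch(256, 2)

for name, length, channels, params in ptb_layers:
    print(f"{name:6s} length={length:3d} channels={channels:3d} params={params}")
```

Under these assumptions both input sizes reduce to a sequence length of 2 before flattening, so the two datasets would share the same dense-layer input size. A different pooling size or placement would change these numbers.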
The model uses the binary cross-entropy loss function and the Adam optimizer. During training, half of the pairs consist of signals from the same person and half of signals from different persons.
Table 2 presents the results of the proposed model on the two datasets. Our model shows 92% and 95% average accuracy for authentication on the ECG-ID and PTB datasets, respectively. Comparing these results with three state-of-the-art models (CNN-LSTM, LSTM, and Bi-LSTM) shows that the proposed model outperforms its rivals. For a fair comparison, we used 5-fold cross-validation when splitting the created dataset into training and test sets. The results vary by around 3% because of randomness in creating pairs of samples from the same and different persons.
Table 2: Comparing the accuracy of the proposed model with CNN-LSTM, LSTM, and Bi-LSTM on the ECG-ID and PTB datasets for authentication.
Method | Accuracy ECG-ID | Accuracy PTB |
The proposed method | 92.1 % | 95.3 % |
CNN-LSTM | 76.8 % | 81.4 % |
LSTM | 82.7 % | 88.0 % |
Bi-LSTM | 84.1 % | 88.9 % |
Most competing methods in authentication research were originally designed to classify samples, whereas our model can be used directly for authentication. Other advantages of the proposed method are its ability to work on imbalanced datasets, owing to the way data pairs are produced, and its robustness, owing to the use of a CNN model for feature extraction.
In this article, we proposed Siamese networks for ECG authentication. The model directly encodes the data to be stored in a repository, so the raw data is not exposed if an unauthorized entity gains access to the repository. When an input signal is authenticated, it is passed through a CNN model, and the comparison section of the network compares the extracted features with the claimed user's sample vector in the repository and calculates their similarity. As a result, adding a new user does not require retraining the model. We evaluated our model experimentally on two benchmark datasets, where it outperformed other conventional methods used for direct authentication.
There is no conflict of interest with other works.