ECG Based Person Identification System and Personalized Heart Wave Generation
Intro
Your heart may reveal more than your health condition, especially in the context of bio-signal biometrics and authentication systems. Besides heart rate and pathological symptoms, the electrocardiogram (ECG) can reveal your identity as well.
To investigate and demonstrate the potential of ECG biometrics, this project proposed a comprehensive ECG biometric system. It also explored the vulnerability of ECG biometrics by synthesizing ECG signals with generative models (e.g., VAE and GAN).
Related work was published in Replicating Your Heart: Exploring Presentation Attacks on ECG Biometrics.
Summary
- Designed a comprehensive wavelet-distance-based authentication system with replay attack detection.
- Implemented black-box generation of lifelike fake ECG templates using GAN and VAE, achieving 99% and 95% success rates, respectively, in attacking the ECG authentication system.
Motivation
Emerging biometrics that use diverse bio-signal modalities have drawn great interest in both industry and academia. Among them, electrocardiogram (ECG)-based biometrics is quickly standing out. The ECG is the bioelectrical signal arising from the contraction of the heart muscles, and its waveform largely depends on the shape, size, and structure of the heart. Compared with commonly used physiological biometrics (such as fingerprint and face) and behavioral biometrics (such as voice and gait), ECG biometrics tends to be a safer solution because the ECG is an internal signal of the human body that is not directly observable and is present only in living individuals. More importantly, ECG signals exhibit a small level of intrinsic dynamic variance. In other words, even for the same individual, no two heart waves are identical. ECG biometrics is therefore more resistant to conventional presentation or replay attacks.
However, unlike those of conventional biometrics, the security vulnerabilities of ECG biometric systems remain largely unexplored. The multi-faceted roles of the ECG signal and the growing number of online ECG databases could intensify concerns about privacy and security.
Assumptions
- Assumption I: Some ECG biometric databases are publicly available online and can be freely accessed, e.g., MIT-BIH (48 subjects), PTBD (290 subjects), CYBHI (128 subjects) and UofTDB (1,012 subjects).
- Assumption II: The malicious attacker has no prior knowledge of genuine users.
- Assumption III: The attacker has no access to the target authentication system (TAS). The attacker can only obtain the authentication decisions (Accept/Reject), not any detailed matching score or intermediate information.
- Assumption IV: The TAS is able to effectively defend against noise injection and replay attacks.
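To make Assumption III concrete, the sketch below wraps a scoring function so that a caller sees only Accept/Reject. The class name `DecisionOnlyOracle`, the toy scorer, the template, and the threshold are all hypothetical and chosen for illustration; they are not the system used in this project.

```python
from typing import Callable, Sequence


class DecisionOnlyOracle:
    """Hypothetical black-box wrapper: callers see only Accept/Reject."""

    def __init__(self, score_fn: Callable[[Sequence[float]], float], threshold: float):
        self._score_fn = score_fn    # hidden from the attacker
        self._threshold = threshold  # hidden from the attacker
        self.queries = 0             # number of authentication attempts so far

    def authenticate(self, sample: Sequence[float]) -> bool:
        self.queries += 1
        # Lower score means a closer match; only the boolean leaves the system.
        return self._score_fn(sample) <= self._threshold


# Toy enrolled template and scorer (assumptions for illustration only).
template = [0.0, 0.2, 1.0, 0.1, 0.0]


def toy_score(sample: Sequence[float]) -> float:
    """Mean absolute difference between the sample and the template."""
    return sum(abs(a - b) for a, b in zip(sample, template)) / len(template)


tas = DecisionOnlyOracle(toy_score, threshold=0.05)
print(tas.authenticate([0.0, 0.2, 0.95, 0.1, 0.0]))  # sample close to the template
print(tas.authenticate([1.0, 1.0, 1.0, 1.0, 1.0]))   # sample far from the template
print(tas.queries)
```

Counting queries in the wrapper is a convenient way to track how many attempts an attack needs, since each attempt is the only feedback the attacker gets.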
Proposed Target Authentication System (TAS)
- Verification System
  - R-peak detection: Pan-Tompkins algorithm
  - Feature extraction: discrete wavelet transform
  - Decision is made based on 5 beats
- Replay Detector
  - Residual Replay Detector (for SNR < 30): defends against noise injection attacks. A Random Forest Ensemble Classifier (RFEC) is trained on the extracted distributions, using white Gaussian noise injections at different signal-to-noise ratios (SNRs).
  - Low Noise Replay Detector (for SNR > 30): handles the case where a stolen sample is presented repeatedly or only extremely small noise is injected.
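The verification steps above can be sketched as a wavelet-domain distance followed by a decision over several beats. This is a minimal illustration, not the project's implementation: the Haar wavelet, the Euclidean distance, the majority vote, and the function names `haar_dwt`, `wavelet_distance`, and `verify` are all assumptions. R-peak detection and beat segmentation are assumed to have already produced equal-length beats.

```python
import math


def haar_dwt(signal, levels):
    """Multi-level Haar DWT; returns approximation followed by details (coarse to fine)."""
    approx = [float(x) for x in signal]
    details = []
    scale = 1.0 / math.sqrt(2.0)
    for _ in range(levels):
        if len(approx) < 2:
            break
        if len(approx) % 2:
            approx.append(approx[-1])  # pad odd lengths by repeating the last value
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details.append([(a - b) * scale for a, b in pairs])
        approx = [(a + b) * scale for a, b in pairs]
    coeffs = approx
    for d in reversed(details):
        coeffs = coeffs + d
    return coeffs


def wavelet_distance(beat_a, beat_b, levels=3):
    """Euclidean distance between the Haar coefficients of two equal-length beats."""
    ca, cb = haar_dwt(beat_a, levels), haar_dwt(beat_b, levels)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ca, cb)))


def verify(beats, template, threshold, levels=3):
    """Accept when a majority of the presented beats match the template (assumed rule)."""
    votes = sum(wavelet_distance(b, template, levels) <= threshold for b in beats)
    return votes > len(beats) / 2


# Toy 8-sample "beats" (illustrative values, not real ECG data).
template = [0.0, 0.0, 0.1, 1.0, 0.2, 0.0, 0.0, 0.0]
near = [x + 0.01 for x in template]
far = [1.0] * 8
print(verify([near, near, near, near, far], template, threshold=0.1))
print(verify([far] * 5, template, threshold=0.1))
```

In practice a library such as PyWavelets would replace the hand-written transform, and the wavelet family, decomposition level, and threshold would be tuned on enrollment data.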
Generative Model for Bio-signal Synthesis
- Substitute Sample Searching (given a database with a large population, finding a substitute ECG sample is possible)
  - Local binary pattern feature: to reduce noise
  - ECG clustering: to find shared cardiac patterns and common morphological attributes among the population
  - Bayesian search (0 -> 10): to reduce the search space and the number of attempts, speeding up the search
  - Template ranking search (10 -> 100): to find more substitute samples without further accessing the TAS
- Substitute Sample Synthesis (training data augmentation, 100 -> 1000)
  - Curve fitting & autoencoder: to generate more synthesized samples
  - One-class SVM (training an off-line authentication system): to guarantee the quality of training data without accessing the TAS
- ECG Counterfeits Generation (1000 -> infinite)
  - Variational autoencoder (VAE)
  - Generative adversarial network (GAN)
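As an illustration of the local binary pattern feature mentioned in the search stage, here is a minimal one-dimensional LBP sketch. The exact variant used in the project is not specified, so the neighborhood `radius`, the bit ordering, and the function names `lbp_1d` and `lbp_histogram` are assumptions. Each code records only whether neighboring samples are at or above the center sample, so a constant baseline offset does not change the codes.

```python
def lbp_1d(signal, radius=2):
    """1D local binary pattern: one integer code per interior sample."""
    samples = list(signal)
    codes = []
    for i in range(radius, len(samples) - radius):
        center = samples[i]
        neighbours = samples[i - radius:i] + samples[i + 1:i + radius + 1]
        code = 0
        for n in neighbours:
            # Append one bit per neighbour: 1 if it is at or above the center.
            code = (code << 1) | (1 if n >= center else 0)
        codes.append(code)
    return codes


def lbp_histogram(signal, radius=2):
    """Normalized histogram of LBP codes, usable as a fixed-length feature vector."""
    codes = lbp_1d(signal, radius)
    bins = [0] * (2 ** (2 * radius))
    for c in codes:
        bins[c] += 1
    total = len(codes) or 1
    return [b / total for b in bins]


toy = [0, 1, 2, 1, 0]  # illustrative values, not real ECG data
print(lbp_1d(toy, radius=1))
print(lbp_histogram(toy, radius=1))
```

The histogram has a fixed length of 2^(2*radius) regardless of beat length, which makes it convenient as input to a clustering step.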
Evaluation Settings
- Database: UofTDB ECG biometric database
  - Retained 606 subjects with good signal quality after filtering
  - Randomly divided into attacking set (400 subjects) and defending set (206 subjects)
  - Randomly selected 5 subjects as genuine users (with EER ranging from 1% to 5%)
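The split described above could be reproduced along these lines. This is only a sketch: the seed, the integer subject IDs, the function name `split_subjects`, and the choice to draw the genuine users from the defending set are assumptions, and the EER-based screening of genuine users is not modeled.

```python
import random


def split_subjects(subject_ids, n_attack, n_genuine, seed=0):
    """Shuffle subjects, split into attacking/defending sets, and pick genuine users."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    attacking = shuffled[:n_attack]
    defending = shuffled[n_attack:]
    # Assumption: genuine users are sampled from the defending set.
    genuine = rng.sample(defending, n_genuine)
    return attacking, defending, genuine


attacking, defending, genuine = split_subjects(range(606), n_attack=400, n_genuine=5)
print(len(attacking), len(defending), len(genuine))  # 400 206 5
```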
Performance
- TAS performance: robust against replay attacks, achieving a 0% false acceptance rate (FAR) under white noise injection with SNR ranging from 0 to 50. In other words, no matter how much noise was injected, the TAS always identified and rejected replayed samples.
- Generative model performance:
  - VAE: 95% attacking success rate (TAS FAR)
  - GAN: 99% attacking success rate (TAS FAR)
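For readers who want to reproduce this kind of measurement, the sketch below shows how white Gaussian noise can be injected at a target SNR and how FAR is computed from a list of accept/reject decisions. It assumes SNR is expressed in decibels, which the text above does not state; the function names and the sine-wave test signal are illustrative only.

```python
import math
import random


def add_white_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise so the result has roughly the target SNR (in dB)."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy copy."""
    signal_energy = sum(c * c for c in clean)
    noise_energy = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10 * math.log10(signal_energy / noise_energy)


def false_acceptance_rate(decisions):
    """Fraction of attack attempts that the system accepted (True = accepted)."""
    return sum(decisions) / len(decisions)


rng = random.Random(0)
clean = [math.sin(2 * math.pi * i / 200) for i in range(20000)]  # stand-in for a stolen sample
replayed = add_white_noise(clean, snr_db=20, rng=rng)
print(round(measured_snr_db(clean, replayed), 1))
print(false_acceptance_rate([True, False, False, False]))  # 0.25
```

Under this definition, the attacking success rates reported for the VAE and GAN are the FAR of the TAS when every presented sample is a synthesized counterfeit.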
The Severe Issue Revealed
This project reveals the potential security risks introduced by the availability of public databases of all kinds. It also calls for new ways to defend against this threat.
Acknowledgement
This work is supported by the National Science Foundation (NSF).