
Optimized Residual Neural Network for Audio Spoof Detection in Speaker Verification Systems

Madhura Inamdar, Mitali Potnis, Jagannath Nirmal

Abstract


An automatic speaker verification (ASV) system is a biometric technology that uses speech to determine whether a person is an authentic user. Unfortunately, such systems are susceptible to audio spoofing attacks. The proposed work addresses this problem by employing residual networks to determine whether a voice signal is bona fide. A comparative study of Mel-Frequency Cepstral Coefficients (MFCC), Constant Q Cepstral Coefficients (CQCC), and the Log-Magnitude Short-Time Fourier Transform (STFT) is conducted using the Equal Error Rate (EER) metric. The study finds that CQCC features outperform the other features considered for spoofing detection with residual networks. Further, the residual network is optimized by evaluating its performance across various hyperparameter settings. The three features are also combined by means of late fusion. The proposed model is then compared with state-of-the-art models. The comparison indicates that the proposed CQCC-based and late-fusion-based models perform better at spoofing detection, achieving EERs of 6.214% and 5.634%, respectively.
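Since the abstract relies on the Equal Error Rate and on late fusion without defining either, the following minimal sketch illustrates both ideas. It is not the authors' implementation: the function names `equal_error_rate` and `late_fusion` are hypothetical, higher scores are assumed to mean "more likely bona fide", and fusion is assumed to be a plain (optionally weighted) average of per-feature scores, which the abstract does not specify.

```python
import numpy as np


def equal_error_rate(bonafide_scores, spoof_scores):
    """Return (eer, threshold), assuming higher scores mean 'more bona fide'."""
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    # Every observed score is a candidate decision threshold.
    thresholds = np.sort(np.concatenate([bonafide, spoof]))
    # False rejection rate: bona fide trials scoring below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scoring at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # The EER is taken where the two error rates are closest to each other.
    i = int(np.argmin(np.abs(far - frr)))
    return (far[i] + frr[i]) / 2.0, thresholds[i]


def late_fusion(score_lists, weights=None):
    """Average per-trial scores from several systems (assumed fusion rule)."""
    scores = np.asarray(score_lists, dtype=float)  # shape: (n_systems, n_trials)
    return np.average(scores, axis=0, weights=weights)


if __name__ == "__main__":
    # Toy scores only, not results from the paper.
    eer, thr = equal_error_rate([0.9, 0.8, 0.3, 0.7], [0.1, 0.2, 0.75, 0.4])
    print(f"EER = {eer:.2%} at threshold {thr}")
    print(late_fusion([[0.2, 0.8], [0.4, 0.6]]))
```

In this sketch the scores from the MFCC, CQCC, and log-magnitude STFT models would each be passed as one row of `score_lists`, and the fused scores would then be evaluated with `equal_error_rate` in the same way as any single system.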


Keywords


Audio Spoofing Detection, ASVspoof 2019 database, Residual Neural Network, MFCC, Log-magnitude STFT, CQCC



Copyright (c) 2022 Current Trends in Signal Processing