A Review Of Deep Learning Applications For Speech Processing Improvement
Abstract
Improving the quality of speech is a common goal of many audio and speech signal processing applications. Speech enhancement can improve the quality and intelligibility of a noisy speech signal, and it is critical in a wide range of fields, including hearing aids, automatic speech recognition (ASR), and mobile communication. In recent years, architectures based on deep neural networks (DNNs) have proven effective for both speech recognition and speech enhancement. Most current research on speech enhancement assumes that a single noise type corrupts the speech, which is far from real-world conditions. In this study, we examine speech enhancement in realistic settings, where several noise sources may contaminate speech simultaneously. In particular, we focus on improving the clarity of workplace speech in which multiple stationary and non-stationary noises may be present at the same time. DNNs can be used to enhance speech in some of these conditions. We also look at DNN training that uses psychoacoustic models from speech coding to enhance noisy speech.
Copyright (c) 2022 Journal of Telecommunication, Switching Systems and Networks