
A Review Of Deep Learning Applications For Speech Processing Improvement

Mr. Kommu Naveen, Dr. R.M.S Parvathi


Improving the quality of speech is a common goal of many audio and speech signal processing applications. Speech enhancement can improve both the quality and the intelligibility of a noisy speech signal, and it is critical in a wide range of applications, including hearing aids, automatic speech recognition (ASR), and mobile communication. Recent work has shown Deep Neural Network (DNN) architectures to be quite effective for speech recognition and enhancement. In this study, we examine speech enhancement under real-world conditions, where several disturbances may contaminate speech at the same time. Current research on speech enhancement focuses mostly on speech corrupted by a single noise type, which is far from the situations encountered in practice. In particular, we are concerned with improving the clarity of workplace speech in which several stationary and non-stationary noises may be present simultaneously. In some circumstances, DNNs may be used to enhance such speech. We also look at DNN training that uses psychoacoustic models from speech coding to enhance noisy speech.


Deep Neural Networks, speech signal processing, EMD, deep complex, MSE







Copyright (c) 2022 Journal of Telecommunication, Switching Systems and Networks