HMM-Based Text-to-Speech Synthesis & Stressed Speech Processing

Pradnya Prakash Wagh, Swati Warungase, Neha Ashok Aringale, Nadeem Bhaimiya Shaikh


This paper reports that the new system produces synthetic speech of noticeably higher quality than
speaker-dependent systems when real speech data sets are used, and that it remains competitive with speaker-dependent
approaches even when substantial speech data sets are available. The excitation signal of speech, the glottal source, has
naturally attracted interest in speech synthesis research, and a variety of techniques have been developed to mimic the
glottal source of natural speech. Artificial models of the glottal source have improved synthesis quality. However, the
current models also oversimplify the glottal source, which limits the synthesis quality they can achieve. Glottal inverse
filtering, which recovers glottal flow pulses from natural speech, has been proposed as a solution to the problems arising
from simplistic glottal source models. However, previous work with glottal flow pulses extracted from real speech is limited
to special cases, such as isolated vowels, and the benefits of combining automatic glottal inverse filtering with an
HMM-based speech synthesizer have not been explored. Furthermore, a comparative analysis across several speech synthesis
methods demonstrates the robustness of the new approach: it can build voices from less-than-ideal speech data and
synthesize high-quality speech even for sentences outside its training domain.
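
The idea behind inverse filtering mentioned above can be illustrated with a minimal sketch. It is not the automatic glottal inverse filtering method of the paper: it assumes a plain all-pole (LPC) vocal tract model estimated with the autocorrelation method, and it uses a synthetic impulse train as a stand-in for the glottal excitation. The function names (lpc_autocorr, inverse_filter, all_pole_filter), the pole positions, and the pitch period are all hypothetical choices made for illustration.

```python
import numpy as np

def lpc_autocorr(x, order):
    """Estimate the inverse-filter polynomial A(z) with the autocorrelation method."""
    x = np.asarray(x, dtype=float)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], where x[n] is predicted as sum_k a_k x[n-k]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    # A(z) = 1 - sum_k a_k z^-k
    return np.concatenate(([1.0], -a))

def inverse_filter(x, a_poly):
    """Apply the FIR inverse filter A(z) to x to estimate the excitation."""
    return np.convolve(x, a_poly)[: len(x)]

def all_pole_filter(e, a_poly):
    """Pass the excitation e through the synthesis filter 1/A(z)."""
    y = np.zeros(len(e))
    for n in range(len(e)):
        acc = e[n]
        for k in range(1, len(a_poly)):
            if n - k >= 0:
                acc -= a_poly[k] * y[n - k]
        y[n] = acc
    return y

# Assumed toy "vocal tract": two resonances given as conjugate pole pairs.
poles = [0.9 * np.exp(1j * 0.2 * np.pi), 0.9 * np.exp(-1j * 0.2 * np.pi),
         0.85 * np.exp(1j * 0.6 * np.pi), 0.85 * np.exp(-1j * 0.6 * np.pi)]
a_true = np.real(np.poly(poles))

# Assumed toy excitation: one unit impulse every 80 samples.
excitation = np.zeros(4000)
excitation[::80] = 1.0

speech = all_pole_filter(excitation, a_true)     # synthetic "speech"
a_est = lpc_autocorr(speech, order=4)            # estimated A(z)
residual = inverse_filter(speech, a_est)         # recovered excitation estimate
```

In this toy setting the estimated coefficients are close to the true ones, so the residual is close to the original impulse train. Real speech is harder: the glottal pulse has its own spectral shape and lip radiation must be compensated, which is why dedicated glottal inverse filtering methods are used in practice.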

