Prediction of voice aperiodicity based on spectral representations in HMM speech synthesis

Silén, Hanna; Helander, Elina; Gabbouj, Moncef

Abstract

In hidden Markov model-based speech synthesis, speech is typically parameterized using source-filter decomposition. A widely used analysis/synthesis framework, STRAIGHT, decomposes the speech waveform into a framewise spectral envelope and a mixed-mode excitation signal. Including an aperiodicity measure in the model also enables synthesis of signals that are not purely voiced or unvoiced. The traditional approach, which employs hidden Markov modeling and decision tree-based clustering, does not take the connection between the speech spectrum and aperiodicities into account. In this paper, we exploit this dependency and predict voice aperiodicities afterwards, based on synthetic spectral representations. Evaluations carried out on English data confirm that the proposed approach provides prediction accuracy comparable to that of the traditional approach.

Keywords

aperiodicity prediction; hidden Markov model; speech synthesis

Year: 2011
Book title: Interspeech
Pages: 105-108
Address: Florence, Italy
Month: August
Note: Demo available at: http://www.cs.tut.fi/sgn/arg/silen/is2011/AperiodicityPrediction.html