Audio Signal Modeling with Sinusoids Plus Noise

Virtanen, Tuomas

In audio signal spectrum modeling, the aim is to transform a signal into a form that is easier to work with, discarding information that is irrelevant to perception. The sinusoids plus noise model is a spectral model in which the periodic components of a sound are represented by sinusoids with time-varying frequencies, amplitudes and phases. The remaining non-periodic components are represented by filtered noise. The sinusoidal model exploits the physical properties of musical instruments, and the noise model exploits the human inability to perceive the exact spectral shape or phase of stochastic signals. In polyphonic music signals, estimating the parameters of the sinusoids is difficult because the periodic components are usually not stable. Sufficient time resolution and frequency resolution are also difficult to achieve simultaneously. A large part of this thesis discusses the detection and parameter estimation of periodic components using several algorithms. In addition to existing algorithms, a new iterative algorithm based on the fusion of closely spaced sinusoids is presented. The sinusoidal model is applied to the separation of overlapping sounds and to sound manipulation. In sound separation, a new perceptual distance measure between sinusoids is used; it is based on the way humans associate spectral components with sound sources. A new separation method based on multipitch estimation is also described. Finally, modifying the pitch and time scale of sounds with the sinusoids plus noise model without affecting sound quality is briefly explained.
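As an illustration of the signal model described above, the following minimal sketch synthesizes a signal as a sum of sinusoids with time-varying frequency and amplitude, plus filtered noise. It is not the thesis's implementation: the function names, the linear parameter trajectories, and the one-pole low-pass filter used to shape the noise are all assumptions chosen to keep the example short, and it covers synthesis only, not the parameter estimation that the thesis focuses on.

```python
import math
import random


def synthesize_sinusoids(tracks, n_samples, sample_rate):
    """Sum sinusoids whose frequency and amplitude change linearly over time.

    Each track is a tuple (f_start_hz, f_end_hz, a_start, a_end, initial_phase).
    Linear trajectories are an illustrative assumption, not the thesis's method.
    """
    out = [0.0] * n_samples
    for f_start, f_end, a_start, a_end, phase in tracks:
        for n in range(n_samples):
            t = n / max(n_samples - 1, 1)  # normalized time in [0, 1]
            freq = f_start + (f_end - f_start) * t
            amp = a_start + (a_end - a_start) * t
            out[n] += amp * math.cos(phase)
            # Accumulate phase so the frequency can vary without discontinuities.
            phase += 2.0 * math.pi * freq / sample_rate
    return out


def filtered_noise(n_samples, gain, smoothing, seed=0):
    """White Gaussian noise shaped by a one-pole low-pass filter (assumed filter)."""
    rng = random.Random(seed)
    out = []
    state = 0.0
    for _ in range(n_samples):
        white = rng.gauss(0.0, 1.0)
        state = smoothing * state + (1.0 - smoothing) * white
        out.append(gain * state)
    return out


def sinusoids_plus_noise(tracks, n_samples, sample_rate,
                         noise_gain=0.05, smoothing=0.9, seed=0):
    """Combine the periodic (sinusoidal) part and the stochastic (noise) part."""
    periodic = synthesize_sinusoids(tracks, n_samples, sample_rate)
    stochastic = filtered_noise(n_samples, noise_gain, smoothing, seed)
    return [p + s for p, s in zip(periodic, stochastic)]


if __name__ == "__main__":
    # Hypothetical example: a decaying 440 Hz partial that glides slightly
    # upward, plus a weaker stable partial at 880 Hz, over 0.1 s at 16 kHz.
    example_tracks = [
        (440.0, 445.0, 0.8, 0.2, 0.0),
        (880.0, 880.0, 0.3, 0.1, 0.0),
    ]
    signal = sinusoids_plus_noise(example_tracks, n_samples=1600, sample_rate=16000)
    print(len(signal), "samples synthesized")
```

Keeping the periodic and stochastic parts in separate functions mirrors the split in the model: each part can be modified independently, for example by scaling only the sinusoid frequencies to change pitch.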
