Sound analysis and synthesis are employed in many applications, including speech recognition, speech synthesis, sound editing and transformation, active noise reduction systems, modeling of general acoustical sources such as musical instruments and voices, sound compression, and data storage. Known methods and systems decompose sound into a collection of sinusoids with varying amplitudes, frequencies, and in some cases, phases. Decomposition is typically accomplished with Fourier analysis over short time frames, followed by peak detection, interpolation, and tracking of partials across sequences of fast Fourier transform (FFT) vectors. To compensate for noise and imperfections introduced by the limitations of the FFT characterization, additional steps are often applied to characterize the noisy part of the audio signal. The recognized tradeoffs among accuracy, computational complexity, and compression rate remain an active subject of research.
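The conventional analysis step described above (windowed FFT of one short frame, peak picking, and interpolation) can be sketched as follows. This is a minimal illustration of the generic prior-art approach, not the method of this technology; the function name `frame_peaks`, the Hann window, and the log-parabolic interpolation are assumptions chosen for the example, and partial tracking across frames is omitted.

```python
import numpy as np

def frame_peaks(frame, sample_rate, num_peaks=5):
    """Return (frequency_hz, magnitude) pairs for the strongest peaks of one frame."""
    # Window the frame to reduce spectral leakage, then take the magnitude spectrum.
    window = np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(frame * window))

    # Peak detection: interior bins strictly larger than both neighbours.
    interior = spectrum[1:-1]
    is_peak = (interior > spectrum[:-2]) & (interior > spectrum[2:])
    peak_bins = np.nonzero(is_peak)[0] + 1

    # Keep the strongest peaks, ordered by decreasing magnitude.
    order = np.argsort(spectrum[peak_bins])[::-1][:num_peaks]
    strongest = peak_bins[order]

    peaks = []
    for k in strongest:
        # Interpolation: fit a parabola to the log magnitudes around the peak bin
        # to estimate the fractional bin position of the true maximum.
        a, b, c = np.log(spectrum[k - 1:k + 2] + 1e-12)
        offset = 0.5 * (a - c) / (a - 2.0 * b + c)
        freq_hz = (k + offset) * sample_rate / len(frame)
        peaks.append((freq_hz, spectrum[k]))
    return peaks
```

Calling `frame_peaks` on successive overlapping frames yields the per-frame peak lists that a partial tracker would then link over time.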
A researcher from UC San Diego has developed a new audio coding method that allows efficient decomposition of an audio signal into periodic and noise components. The components can be recombined after processing operations, such as compression or editing, to reconstruct a modified version of the audio signal. The sound model can also be used to store and modify clips of sounds for synthesis applications, such as concatenative synthesis of speech or music.
Specifically, this technology provides a robust and efficient sound analysis and synthesis method using high bit rate encoding of an audio signal. An audio signal is decomposed into sinusoidal, modulated sinusoidal, and noise components by comparing two separate spectral representations. The autoregressive (AR) and minimum variance distortionless response (MVDR) spectra are calculated from linear prediction coefficients in a time-varying manner. The spectral envelope of spectral lines in noise is estimated from selected properties of the spectral representations. A noisality index is derived that assigns different weights to the contributions of sinusoidal and noise components at every frequency. The noisality index is used to reduce the order of the AR model and to perform re-synthesis of the sinusoidal and noise components.
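As an illustration of the two spectral representations named above, the sketch below computes an AR spectrum and an MVDR spectrum for one frame from linear prediction coefficients. It is a textbook construction under stated assumptions, not the patented procedure: the coefficients come from a Levinson-Durbin recursion on a biased autocorrelation estimate, the AR spectrum is E_M / |A_M(e^{jw})|^2, and the MVDR spectrum uses the standard identity 1/P_MVDR(w) = sum over m = 0..M of |A_m(e^{jw})|^2 / E_m, with the scaling P_MVDR = 1 / (v^H R^{-1} v). All function names and default parameters are hypothetical. The noisality index is not sketched because its formula is not given in this description.

```python
import numpy as np

def autocorrelation(frame, order):
    """Biased autocorrelation estimate r[0..order] of one frame."""
    n = len(frame)
    return np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)]) / n

def levinson_durbin(r, order):
    """Return prediction polynomials A_0..A_order (leading 1) and their error powers."""
    a = np.array([1.0])
    err = r[0]
    predictors, errors = [a.copy()], [err]
    for m in range(1, order + 1):
        # Reflection coefficient from the current predictor and r[1..m].
        k = -np.dot(a, r[m:0:-1]) / err
        a_ext = np.concatenate([a, [0.0]])
        a = a_ext + k * a_ext[::-1]      # order-update of the predictor
        err = err * (1.0 - k * k)        # order-update of the error power
        predictors.append(a.copy())
        errors.append(err)
    return predictors, errors

def ar_and_mvdr_spectra(frame, order=12, n_fft=512):
    """Return (P_AR, P_MVDR) sampled on the n_fft // 2 + 1 rfft frequency bins."""
    r = autocorrelation(frame, order)
    predictors, errors = levinson_durbin(r, order)
    # Inverse AR spectra |A_m(e^{jw})|^2 / E_m for every model order m = 0..order.
    inverse_ar = [np.abs(np.fft.rfft(a, n_fft)) ** 2 / e
                  for a, e in zip(predictors, errors)]
    p_ar = 1.0 / inverse_ar[-1]                 # highest-order AR spectrum
    p_mvdr = 1.0 / np.sum(inverse_ar, axis=0)   # MVDR via the sum of inverse AR spectra
    return p_ar, p_mvdr
```

Because the MVDR spectrum sums the inverse AR spectra of all lower orders, it never exceeds the AR spectrum of the same order and is smoother around narrow peaks, which is one reason comparing the two representations can help separate sinusoidal lines from noise. Applying `ar_and_mvdr_spectra` frame by frame gives the time-varying behavior the description mentions.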
Patent Pending
audio codecs, compression