Department of Computational Perception
Johannes Kepler Universität Linz


FRAME LEVEL AUDIO SIMILARITY - A CODEBOOK APPROACH


We provide some examples of the decomposition of music signals into three signals: the reconstructed model signal, the positive residual signal, and the negative residual signal. The reconstructed model signal gives an impression of what is captured by the song model. The positive residual signal is resynthesized from the part of the residual spectrum where the model spectrum underestimates the original spectrum. The negative residual signal is resynthesized from the part of the residual spectrum where the model spectrum overestimates the original spectrum.
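The split into positive and negative residuals can be illustrated with a minimal sketch. It assumes that both the original and the model are available as non-negative magnitude spectrograms of the same shape; this page does not state which spectral representation is actually used, nor how the signals are resynthesized, so the function name `split_residual` and the toy values are purely hypothetical.

```python
import numpy as np

def split_residual(original_mag, model_mag):
    """Split the residual between two spectrograms into a positive and a negative part.

    Assumption: both inputs are non-negative magnitude spectrograms
    (frequency bins x frames) of identical shape.
    """
    original_mag = np.asarray(original_mag, dtype=float)
    model_mag = np.asarray(model_mag, dtype=float)
    residual = original_mag - model_mag
    # Positive residual: bins where the model underestimates the original.
    positive = np.maximum(residual, 0.0)
    # Negative residual: bins where the model overestimates the original.
    negative = np.maximum(-residual, 0.0)
    return positive, negative

# Toy 2x2 "spectrograms" (hypothetical values, not taken from the examples).
original = np.array([[1.0, 0.5], [0.2, 0.8]])
model = np.array([[0.6, 0.7], [0.2, 0.1]])
pos, neg = split_residual(original, model)
```

Under this assumption the original spectrum equals `model + pos - neg`, and each bin is non-zero in at most one of the two residuals. Turning any of the three spectra back into audio would additionally require phase information, which this sketch leaves out.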

Example 1
Aphrodite - London Massive (original) The original clip is approximated quite well because this clip is very percussive.
Aphrodite - London Massive (model)
Aphrodite - London Massive (positive residual)
Aphrodite - London Massive (negative residual)
Example 2
Atreyu - Bleeding Mascara (original) For this heavy metal song, the positive residual is of particular interest. The guitar and the voice both come through very cleanly in the residual, while noise and percussion are captured by the model.
Atreyu - Bleeding Mascara (model)
Atreyu - Bleeding Mascara (positive residual)
Atreyu - Bleeding Mascara (negative residual)
Example 3
Mo Blaze - We Ridin (original) A hip-hop song. One can perceive some voice elements in the reconstructed signal.
Mo Blaze - We Ridin (model)
Mo Blaze - We Ridin (positive residual)
Mo Blaze - We Ridin (negative residual)
Example 4
Cynthia Jordan - Sagesse (original) This is a negative example that demonstrates how the model fails to approximate tonal elements. Since there are no percussive elements, our model stores very little information, and the positive residual is almost identical to the original clip.
Cynthia Jordan - Sagesse (model)
Cynthia Jordan - Sagesse (positive residual)
Cynthia Jordan - Sagesse (negative residual)
Example 5
ectoplasm - fat hairs (original) A song from the genre Electronic & Dance. The bass events in particular are approximated quite well. Listen to the positive residual signal.
ectoplasm - fat hairs (model)
ectoplasm - fat hairs (positive residual)
ectoplasm - fat hairs (negative residual)
Example 6
Metal Gear - Theme (original) The example visualized in the paper. The percussive elements that enter toward the end of the audio clip are approximated far better than the sinusoidal components in the first two-thirds of the clip. The positive residual contains the sinusoidal parts; note how the percussive elements at the end of the positive residual clip are cut out.
Metal Gear - Theme (model)
Metal Gear - Theme (positive residual)
Metal Gear - Theme (negative residual)



last edited by editor at 2008-03-25