Discovering Web Music using Signal Analysis

These mp3s have been grouped together based on their acoustic similarity.


We never looked at the file names or ID3 tags, only the audio signal itself.


While imperfect, this technique shows great potential for working with user-generated content.


The first track is the seed; the rest are listed in order of similarity to it.


Check out these playlists of mp3s that are similar to:


These recommendations were created by analyzing the audio signal of each track in our collection. No metadata or ratings information was used in this analysis. We looked at playlists from the 500 most popular playlist authors at Webjay and analyzed over 25,000 audio tracks using MARSYAS, an open source software framework for audio processing.

For each track, we downsampled the audio signal and converted a 30-second chunk into 24 different audio features.
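To make the step above concrete, here is a minimal sketch in plain Python. It is not the MARSYAS pipeline and does not reproduce the 24 features used in the demo, which are not listed here. As stand-ins it computes only two simple per-frame measurements (RMS energy and zero-crossing rate) and summarizes each by its mean and standard deviation. The function names, the block-averaging downsampler, the frame size, and the downsampling factor are all assumptions chosen for illustration.

```python
import math
from statistics import mean, pstdev

def downsample(samples, factor):
    """Average each run of `factor` samples into one (a crude low-pass and decimate)."""
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor for i in range(0, usable, factor)]

def frame_features(frame):
    """Return (RMS energy, zero-crossing rate) for one frame of samples."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return rms, crossings / (len(frame) - 1)

def track_features(samples, sample_rate, factor=2, chunk_seconds=30, frame_size=512):
    """Turn the first `chunk_seconds` of a track into a small feature vector.

    Illustrative only: returns 4 numbers, not the 24 features used in the demo.
    """
    reduced = downsample(samples, factor)
    rate = sample_rate // factor
    chunk = reduced[: chunk_seconds * rate]
    # Split the chunk into non-overlapping frames, dropping any short tail.
    frames = [chunk[i:i + frame_size]
              for i in range(0, len(chunk) - frame_size + 1, frame_size)]
    if not frames:
        raise ValueError("chunk is shorter than one frame")
    rms_values, zcr_values = zip(*(frame_features(f) for f in frames))
    return [mean(rms_values), pstdev(rms_values),
            mean(zcr_values), pstdev(zcr_values)]
```

Each track ends up as one fixed-length list of numbers, regardless of how long the original file is, which is what makes tracks comparable to one another.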

Recommendations are based on a "source track", which is the first track that appears in the playlist. The tracks that follow it are the closest vectors in the audio feature space.
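The ranking described above can be sketched as a nearest-neighbor lookup. This is an illustration under stated assumptions, not the code behind the demo: it assumes plain Euclidean distance (the distance measure actually used is not stated here), and the function name, the `features` dictionary, and the `limit` parameter are hypothetical.

```python
import math

def rank_by_similarity(seed_id, features, limit=10):
    """Return the seed track followed by its `limit` nearest neighbors.

    `features` maps a track id to its feature vector (a list of floats).
    Euclidean distance is an assumption made for this sketch.
    """
    seed = features[seed_id]
    distances = [
        (math.dist(seed, vector), track_id)
        for track_id, vector in features.items()
        if track_id != seed_id
    ]
    distances.sort()  # smallest distance first
    return [seed_id] + [track_id for _, track_id in distances[:limit]]

# Tiny made-up example with 2-dimensional vectors.
example = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [5.0, 5.0], "d": [0.5, 0.0]}
playlist = rank_by_similarity("a", example, limit=2)
```

In practice the features would usually be scaled to comparable ranges first, so that no single feature dominates the distance.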

Notes:

If a song appears twice, the two entries are different media files. It makes sense to see copies of the same song close to each other, as they should be acoustically similar!

If a track doesn't load, it has probably been removed; this data was originally collected over six months ago. Such is the world of web media...

If a track's name is "UNDEFINED", there is no metadata available for it. However, we are still able to recognize this track and recommend it because our analysis doesn't require any metadata.

Yahoo! Music Research 2006

This demo is an application of research by Malcolm Slaney and William White and builds upon the work of George Tzanetakis.