Music Data Sets
This page contains data sets frequently used in web- and social media-based
music information research as well as in multimodal music information
retrieval tasks.
For the MusicMicro data set, see here. For the MusiClef
2012 data set, see here.
This is a collection of 224 artists categorized into 14 genres with a
uniform genre distribution. It was proposed in the paper:
Artist Classification with Web-based Data
P. Knees, E. Pampalk, and G. Widmer.
Proceedings of the 5th International Conference on Music Information
Retrieval (ISMIR 2004),
pp. 517-524, Barcelona, Spain, October 10-14, 2004.
This is a collection of 3,000 artists, corresponding to the top-ranked last.fm artists (filtered by
occurrence in allmusic.com).
The genre assignment originates from allmusic.com (18 distinct genres, skewed genre
distribution). The paper where the data set was first used is:
Investigating Web-Based Approaches to Revealing Prototypical Music Artists
in Genre Taxonomies
M. Schedl, P. Knees, and G. Widmer.
Proceedings of the 1st IEEE International Conference on Digital
Information Management (ICDIM'06),
Bangalore, India, December 6-8, 2006.
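The difference between a uniform genre distribution (224 artists over 14 genres, i.e. 16 artists per genre) and a skewed one can be checked with a short script. The sketch below assumes an artist list is available as (artist, genre) pairs. The function names and the toy data are hypothetical and do not reflect the actual file format of the data sets.

```python
from collections import Counter


def genre_distribution(pairs):
    """Count how many artists are assigned to each genre."""
    return Counter(genre for _artist, genre in pairs)


def is_uniform(distribution):
    """Return True if every genre has the same number of artists."""
    return len(set(distribution.values())) <= 1


# Toy data standing in for real artist lists (hypothetical names).
# 14 genres x 16 artists = 224 artists, mirroring the uniform collection.
uniform_pairs = [
    (f"artist_{g}_{i}", f"genre_{g}") for g in range(14) for i in range(16)
]
# A tiny skewed example: three artists in one genre, one in another.
skewed_pairs = [("a", "rock"), ("b", "rock"), ("c", "rock"), ("d", "jazz")]

uniform_counts = genre_distribution(uniform_pairs)
skewed_counts = genre_distribution(skewed_pairs)

print(is_uniform(uniform_counts), is_uniform(skewed_counts))
```

For a real artist list, only the construction of the pairs would change; the counting and the uniformity check stay the same.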
These are two artist collections used for microblog indexing
experiments. C111ka contains a list of 110,588 artists (without genre
information). C49ka comprises 48,800 artists, for which genre
information is available as well.