LFM-1b Dataset
LFM-1b UGP Dataset
 
Corpus of Music Listening Events for Music Recommendation

Description

This web page hosts the LFM-1b dataset of more than one billion listening events, intended for music retrieval and recommendation tasks. A paper describing the dataset was accepted to the ACM International Conference on Multimedia Retrieval (ICMR) 2016 and is available for download. A follow-up publication with a more thorough statistical analysis was published in the International Journal of Multimedia Information Retrieval, available here. Please consult these publications for details.

If you use the LFM-1b dataset in your own research, please cite the following paper:

The LFM-1b Dataset for Music Retrieval and Recommendation
Schedl, M.
Proceedings of the ACM International Conference on Multimedia Retrieval (ICMR 2016), New York, USA, April 2016.

As an extension to LFM-1b, the LFM-1b User Genre Profile (LFM-1b UGP) dataset was published as a paper in the proceedings of the IEEE International Symposium on Multimedia (ISM) 2017. It uses Last.fm artist tags indexed with two dictionaries of genre and style descriptors (from Allmusic and Freebase) to create, for each user in LFM-1b, a preference profile as a vector over genres. This vector can be used as a proxy for the user's genre/style preferences. Combined with the demographic information available in LFM-1b, such profiles can also be created at other levels, e.g., for countries or for users in a particular age group.
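
To illustrate the idea of a genre profile, here is a minimal Python sketch on toy data. It is not the procedure from the ISM 2017 paper: the names GENRES, ARTIST_TAGS, USER_PLAYCOUNTS and USER_COUNTRY, the weighting of each matching tag by play count, the normalization to sum 1, and the per-country averaging are all assumptions made for this example, and no file format of the dataset is assumed.

```python
from collections import Counter

# Hypothetical toy inputs: none of these names or values come from the dataset files.
GENRES = ["rock", "pop", "jazz"]  # stand-in for a genre/style dictionary

ARTIST_TAGS = {
    "artist_a": ["rock", "seen live", "pop"],
    "artist_b": ["jazz"],
    "artist_c": ["favourites"],  # no tag matches the dictionary
}

USER_PLAYCOUNTS = {
    "user_1": {"artist_a": 3, "artist_b": 1, "artist_c": 5},
    "user_2": {"artist_b": 2},
    "user_3": {"artist_c": 4},
}

USER_COUNTRY = {"user_1": "AT", "user_2": "AT", "user_3": "US"}


def genre_profile(playcounts, artist_tags, genres):
    """Return one user's preference vector over `genres`, normalized to sum to 1."""
    genre_set = set(genres)
    weights = Counter()
    for artist, count in playcounts.items():
        for tag in artist_tags.get(artist, []):
            if tag in genre_set:
                # Assumption: each matching tag is weighted by the artist's play count.
                weights[tag] += count
    total = sum(weights.values())
    if total == 0:
        return [0.0] * len(genres)  # user listened to no artist with a dictionary tag
    return [weights[g] / total for g in genres]


def group_profiles(user_profiles, user_group):
    """Average the user vectors within each group (here: country)."""
    grouped = {}
    for user, vector in user_profiles.items():
        grouped.setdefault(user_group[user], []).append(vector)
    return {
        group: [sum(column) / len(vectors) for column in zip(*vectors)]
        for group, vectors in grouped.items()
    }


user_profiles = {
    user: genre_profile(counts, ARTIST_TAGS, GENRES)
    for user, counts in USER_PLAYCOUNTS.items()
}
country_profiles = group_profiles(user_profiles, USER_COUNTRY)
```

Tags that are not in the dictionary (such as "seen live") are ignored, so a user whose artists carry no genre tags ends up with an all-zero vector. Other weightings, for example counting each artist once regardless of play count, would fit the same structure.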

If you use the LFM-1b UGP dataset in your own research, please cite the following paper:

Large-scale Analysis of Group-specific Music Genre Taste From Collaborative Tags
Schedl, M. and Ferwerda, B.
Proceedings of the 19th IEEE International Symposium on Multimedia (ISM 2017), Taichung, Taiwan, December 2017.

Datasets

You can download the LFM-1b dataset here: LFM-1b.zip (~8 GB).
The LFM-1b User Genre Profile (LFM-1b UGP) dataset is available here: LFM-1b_UGP.zip (143 MB).

Code

We also provide example Python code that shows how to access the LFM-1b dataset and compute simple statistics:

LFM-1b_stats.py: Code to load the user-artist matrix, compute some simple statistics, and store them in a text file.
LFM-1b_plot.py: Code to create some plots of the dataset's properties, based on the output of LFM-1b_stats.py.
LFM-1b_recommend-CF.py: Code that implements a simple collaborative filtering recommender.
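
For readers who want a feel for what such scripts do before downloading them, the following self-contained sketch computes two simple statistics on a toy user-artist play-count matrix and produces user-based collaborative filtering recommendations. It does not reproduce the scripts listed above: the PLAYS data, the function names, the use of cosine similarity, and the similarity-weighted scoring are all assumptions chosen for illustration, and the dataset's actual file format is not used.

```python
import math

# Hypothetical toy user-artist play counts (a sparse matrix stored as nested dicts).
PLAYS = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 4, "b": 2, "c": 6},
    "u3": {"c": 1, "d": 7},
}


def matrix_stats(plays):
    """Return (number of users, number of artists, density of the matrix)."""
    artists = {artist for counts in plays.values() for artist in counts}
    nonzero = sum(len(counts) for counts in plays.values())
    density = nonzero / (len(plays) * len(artists))
    return len(plays), len(artists), density


def cosine(x, y):
    """Cosine similarity between two sparse play-count vectors."""
    dot = sum(x[k] * y[k] for k in set(x) & set(y))
    norm_x = math.sqrt(sum(v * v for v in x.values()))
    norm_y = math.sqrt(sum(v * v for v in y.values()))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0


def recommend(user, plays, k=1, n=3):
    """Recommend up to n artists the user has not played, using the k most similar users."""
    target = plays[user]
    neighbours = sorted(
        ((cosine(target, counts), name) for name, counts in plays.items() if name != user),
        reverse=True,
    )[:k]
    scores = {}
    for similarity, name in neighbours:
        for artist, count in plays[name].items():
            if artist not in target:
                # Assumption: score = sum over neighbours of similarity * play count.
                scores[artist] = scores.get(artist, 0.0) + similarity * count
    # Highest score first; ties are broken alphabetically for reproducibility.
    return sorted(scores, key=lambda artist: (-scores[artist], artist))[:n]


n_users, n_artists, density = matrix_stats(PLAYS)
recommendations_u1 = recommend("u1", PLAYS, k=1)
recommendations_u3 = recommend("u3", PLAYS, k=2)
```

On this toy data, u2 is the only user who shares artists with u1, so u1 is recommended the one artist of u2 that u1 has not yet played. At the scale of the full dataset, nested dictionaries would likely be too slow and memory-hungry; a sparse matrix library would be the more realistic choice.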

last edited by ms on 2017-09-27