MMTD - Million Musical Tweets Dataset
Corpus of Music Microblogging Behavior (2011-11-09 to 2013-04-30)

This page hosts the Million Musical Tweets Dataset (MMTD), which is described in a paper accepted at ISMIR 2013.

The data set contains listening histories inferred from microblogs. Each listening event, identified by its tweet ID and user ID, is annotated with temporal (date, time, weekday, timezone), spatial (longitude, latitude, continent, country, county, state, city), and contextual (country-level) information. Each event also points to the corresponding artist and track, and the data includes references to other music-related platforms (MusicBrainz, 7digital, Amazon).

In case you make use of the data set in your own research, please cite the corresponding paper:

The Million Musical Tweets Dataset - What We Can Learn From Microblogs
Hauger, D. and Schedl, M. and Košir, A. and Tkalčič, M.
Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR 2013), Curitiba, Brazil, November 2013.

You can download either the entire dataset (128 MB) or each component separately, below.
All files are zipped, tab-separated text files with the column names in the first row.
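
Given the file format described above, a component can be read with the Python standard library alone. The following is a minimal sketch, not an official loader: the function name read_zipped_tsv is made up for illustration, UTF-8 encoding is an assumption (the page does not state the encoding), and the sketch assumes each zip archive holds a single text file unless a member name is passed.

```python
import csv
import io
import zipfile


def read_zipped_tsv(zip_path, member=None):
    """Yield one dict per data row of a tab-separated file inside a zip archive.

    The first row supplies the column names. If `member` is None, the first
    file in the archive is used (an assumption about the archive layout).
    """
    with zipfile.ZipFile(zip_path) as archive:
        name = member if member is not None else archive.namelist()[0]
        with archive.open(name) as raw:
            # UTF-8 is assumed; the page does not state the encoding.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # QUOTE_NONE keeps quote characters in titles as literal text
            # instead of treating them as CSV field delimiters.
            reader = csv.DictReader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            for row in reader:
                yield row
```

All values come back as strings, so numeric columns such as longitude and latitude need an explicit float() conversion. Because the function is a generator, the larger components can be processed row by row without loading the whole file into memory.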

Components of the Data Set

Each entry lists the columns of one file, followed by its download size.

<artistId musicbrainzArtistId artistName>  15 MB
<countryCode countryName isoAlpha3 fipsCode continent continentName capital areaInSqKm population currencyCode languages west north east south>  14 KB
<location_id latitude longitude country state county city postalCode street timezone>  3.7 MB
<trackId trackTitle artistId>  1.4 MB
<tweetId twitterTweetId twitterUserId artistId trackId datetime weekday longitude latitude>  28 MB
<trackId amazon_asin amazon_albumAsin>  4.7 MB
<trackId 7digitalTrackId>  185 KB
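
The components share the key columns artistId and trackId, so listening events can be combined with track titles and artist names. The sketch below shows one way to do this, assuming the rows have already been loaded as dictionaries keyed by column name. The helper names index_by and attach_names are hypothetical, and the sample rows in the usage example are invented for illustration; they are not taken from the dataset.

```python
def index_by(rows, key):
    """Build a lookup table {row[key]: row} from an iterable of dict rows."""
    return {row[key]: row for row in rows}


def attach_names(tweets, tracks, artists):
    """Yield each tweet row extended with trackTitle and artistName.

    Unresolved references produce None instead of raising an error, because
    this page does not state whether every id resolves.
    """
    tracks_by_id = index_by(tracks, "trackId")
    artists_by_id = index_by(artists, "artistId")
    for tweet in tweets:
        track = tracks_by_id.get(tweet["trackId"])
        artist = artists_by_id.get(tweet["artistId"])
        yield {
            **tweet,
            "trackTitle": track["trackTitle"] if track else None,
            "artistName": artist["artistName"] if artist else None,
        }


# Invented sample rows, for illustration only.
sample_artists = [{"artistId": "7", "musicbrainzArtistId": "", "artistName": "Example Artist"}]
sample_tracks = [{"trackId": "42", "trackTitle": "Example Song", "artistId": "7"}]
sample_tweets = [
    {"tweetId": "1", "artistId": "7", "trackId": "42", "weekday": "2"},
    {"tweetId": "2", "artistId": "8", "trackId": "99", "weekday": "5"},
]

joined = list(attach_names(sample_tweets, sample_tracks, sample_artists))
```

The track and artist tables are indexed in memory once, while the much larger tweet table is streamed through, so memory use is governed by the two smaller lookup tables rather than by the number of listening events.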

