cp.jku - Web & Social Media

Department of

Computational Perception

WEB & SOCIAL MEDIA

We are actively pursuing research related to Web and Social Media Mining and Analysis, exploiting resources such as web pages, blogs, YouTube, Facebook, Twitter, and Last.fm. Many of the following topics are strongly connected to music and multimedia applications. However, we are also open to more general topics, involving web crawling, information retrieval, and information extraction.
Topics can also be worked on as part of, or as an extension to, our classes on Learning from User-generated Data, Multimedia Data Mining, and Social Media Mining and Analysis.

Analysis and Prediction of User Traits and Behavior

This group of topics focuses on analyzing connections between social media data and characteristics of the users, as well as on predicting these characteristics by applying machine learning techniques. Features extracted from social media data may include metadata about the user (e.g., number of tweets per day or number of friends) or aspects extracted via natural language processing. Shared multimedia material (e.g., images and video) or sensor data (e.g., from smartphones) can also be analyzed and incorporated. User characteristics can be of very different kinds, for instance, behavioral or consumption patterns, taste (music, movies, web pages, etc.), age, gender, location, or even personality traits. For analysis and prediction, methods from statistics and machine learning (e.g., correlation analysis and classification, respectively) should be investigated and their performance thoroughly evaluated.
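As a rough illustration of the two steps named above, the following minimal sketch computes a Pearson correlation between one metadata feature and a user trait, and then evaluates a one-feature nearest-centroid classifier with leave-one-out validation. The data, the function names, and the choice of classifier are assumptions made for this example only; a real project would use far more users, more features, and established libraries.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def nearest_centroid_loo(features, labels):
    """Leave-one-out accuracy of a one-feature nearest-centroid classifier."""
    correct = 0
    for i in range(len(features)):
        # Train on all users except user i.
        train = [(f, l) for j, (f, l) in enumerate(zip(features, labels)) if j != i]
        centroids = {}
        for label in {l for _, l in train}:
            values = [f for f, l in train if l == label]
            centroids[label] = sum(values) / len(values)
        # Predict the class whose centroid is closest to the held-out user.
        predicted = min(centroids, key=lambda c: abs(centroids[c] - features[i]))
        correct += predicted == labels[i]
    return correct / len(features)

# Toy, made-up data: tweets per day and age of eight hypothetical users.
tweets_per_day = [12.0, 9.5, 11.0, 8.0, 2.0, 3.5, 1.0, 4.0]
age = [19, 22, 20, 25, 41, 38, 52, 35]
age_group = ["young" if a < 30 else "older" for a in age]

r = pearson(tweets_per_day, age)                       # correlation analysis
acc = nearest_centroid_loo(tweets_per_day, age_group)  # classification
print(f"correlation: {r:.2f}, leave-one-out accuracy: {acc:.2f}")
```

Leave-one-out validation is used here only because the toy data set is tiny; for realistic data sizes, k-fold cross-validation or a held-out test set would be the more usual choice.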

References:

Private traits and attributes are predictable from digital records of human behavior, Kosinski, M., Stillwell, D., Graepel, T., Proceedings of the National Academy of Sciences (PNAS) 110(15), 5802–5805, 2013.
You Are Where You Tweet: A Content-based Approach to Geo-locating Twitter Users, Cheng, Z., Caverlee, J., Lee, K., Proceedings of the 19th ACM International Conference on Information and Knowledge Management (CIKM), Toronto, Canada, 2010.
"I’m Eating a Sandwich in Glasgow": Modeling Locations with Tweets, Kinsella, S., Murdock, V., O’Hare, N., Proceedings of the 3rd International Workshop on Search and Mining User-generated Contents (SMUC), Glasgow, UK, 2011.
Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach, Schwartz, H.A., Eichstaedt, J.C., Kern, M.L., et al., PloS one 8(9), e73791, 2013.
Gender Identification and Age Estimation of Users Based on Music Metadata, Wu, M.-J., Jang, J.-S. R., Lu, C.-H., Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR), Taipei, Taiwan, 2014.

Contact: Markus Schedl, Peter Knees

Trend and Popularity Prediction

Usage of social media has increased sharply in recent years. People now create, share, consume, and comment on all kinds of multimedia items through online social networks and platforms. Predicting whether a particular item will become popular is an active research topic in both academia and industry. Possible topics for practical projects and Master's theses in this context include elaborating new techniques to harvest a variety of data sources (e.g., multimedia content descriptors, microblogs, social network structure, or consumption histories), researching models that describe the popularity of items (songs, videos, persons, scientific publications, etc.), designing computational features that serve as predictors of future popularity in machine learning algorithms, and evaluating them in comprehensive experiments. Another related topic is the analysis and prediction of "popularity flows", i.e., where trends emerge and how they spread temporally and spatially.
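One simple baseline for such experiments is to regress later popularity on early popularity after a log transform. The sketch below fits a least-squares line to log-transformed counts and uses it to forecast. The view counts are made up and deliberately noise-free so the fit is easy to check, and the function names are hypothetical; the sketch does not reproduce the exact model of any of the references.

```python
from math import exp, log

def fit_log_linear(early, late):
    """Least-squares fit of ln(late) = a + b * ln(early)."""
    xs = [log(e) for e in early]
    ys = [log(l) for l in late]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

def predict(a, b, early_count):
    """Forecast the later count of an item from its early count."""
    return exp(a + b * log(early_count))

# Made-up training data: views after one day vs. views after thirty days.
early_views = [10, 50, 200, 1000, 5000]
late_views = [100, 500, 2000, 10000, 50000]

a, b = fit_log_linear(early_views, late_views)
forecast = predict(a, b, 300)  # forecast for a new item with 300 early views
print(f"slope: {b:.2f}, forecast: {forecast:.0f}")
```

In a thesis setting, such a baseline would mainly serve as a reference point against which richer predictors (content descriptors, network structure, microblog activity) are compared.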

References:

Predicting the Popularity of Online Content, Gabor Szabo and Bernardo A. Huberman, Communications of the ACM, 53(8):80-88, August 2010.
On Popularity Prediction of Videos Shared in Online Social Networks, Haitao Li, Xiaoqiang Ma, Feng Wang, Jiangchuan Liu, and Ke Xu, Proceedings of the 22nd ACM International Conference on Information and Knowledge Management (CIKM), San Francisco, CA, USA, 2013.
Twitter-driven YouTube Views: Beyond Individual Influencers, Honglin Yu, Lexing Xie, Scott Sanner, Proceedings of the 22nd ACM International Conference on Multimedia (ACM MM), Orlando, FL, USA, 2014.
The Lifecycle of a Youtube Video: Phases, Content and Popularity, Honglin Yu, Lexing Xie, Scott Sanner, Proceedings of the 9th International AAAI Conference on Weblogs and Social Media (ICWSM), Oxford, UK, 2015.
Will This Paper Increase Your h-index? Scientific Impact Prediction, Yuxiao Dong, Reid A. Johnson, and Nitesh V. Chawla, Proceedings of the Eighth ACM International Conference on Web Search and Data Mining (WSDM), Shanghai, China, 2015.

Contact: Markus Schedl

Opinion Mining and Sentiment Analysis

"Sentiment analysis or opinion mining refers to the application of natural language processing, computational linguistics, and text analytics to identify and extract subjective information in source materials. Generally speaking, sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document. The attitude may be his or her judgement or evaluation (see appraisal theory), affective state (that is to say, the emotional state of the author when writing), or the intended emotional communication (that is to say, the emotional effect the author wishes to have on the reader)." [Wikipedia]

Examples for projects:

Comment and posting analysis
Automatic rating prediction
Controversy detection / detecting bipolar sentiments
Spam-comment filtering
Affect prediction from multimedia content (e.g., speech or video)
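To make the comment analysis and controversy detection examples above more concrete, here is a minimal lexicon-based sketch. The two word lists, the threshold `min_share`, and the function names are hypothetical and chosen for illustration only; practical systems typically rely on much larger lexicons or on trained classifiers.

```python
# Tiny, hypothetical sentiment lexicons (for illustration only).
POSITIVE = {"great", "love", "amazing", "good"}
NEGATIVE = {"awful", "hate", "boring", "bad"}

def polarity(comment):
    """Number of positive words minus number of negative words in a comment."""
    words = [w.strip(".,!?").lower() for w in comment.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def is_controversial(comments, min_share=0.3):
    """Flag an item if both positive and negative comments make up a sizable share."""
    rated = [s for s in (polarity(c) for c in comments) if s != 0]
    if not rated:
        return False
    positive_share = sum(s > 0 for s in rated) / len(rated)
    return min_share <= positive_share <= 1 - min_share

mixed = ["I love this song, amazing!", "So boring, I hate it.",
         "Great video", "Awful, just bad."]
praise = ["Great", "I love it"]

print(is_controversial(mixed))   # comments split evenly between both polarities
print(is_controversial(praise))  # only positive comments
```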

Contact: Marcin Skowron, Markus Schedl, Peter Knees

Recommendation and Discovery

Exploiting the web and social media, one can identify many potential sources for discovery and recommendation of similar items (music pieces, movies, news, etc.). While topics related to all kinds of multimedia material can be worked on, in the context of music, similar artists or songs are of highest interest. Sources to be considered include word patterns on music pages, user-generated tags, ratings, tweets, search engine page counts, peer-to-peer networks, or playlists. Mining these sources can not only provide information on the artistic context of music and help estimate similarity, but also be used to derive measures and indicators such as popularity, hotness, novelty, diversity, and genre prototypicality. In this context, topics related to user modeling and to personalized as well as user-aware retrieval and recommendation can be considered, too.

Examples for projects:

Multimedia recommender systems
User modeling and user-aware retrieval and recommendation
Analyzing and preventing hubness in recommender systems
Implementation and comparison of different similarity measures
Automatic playlist generation
Taste models for music (or multimedia) consumption behavior (diversity, mainstreaminess, openness to new content, etc.)
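As a small illustration of comparing similarity measures on user-generated tags, the sketch below computes Jaccard similarity on tag sets and cosine similarity on tag weights. The artists, tags, and weights are invented for this example, and the two measures are just two common choices among many.

```python
from math import sqrt

def jaccard(tags_a, tags_b):
    """Jaccard similarity of two tag collections, ignoring tag weights."""
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def cosine(weights_a, weights_b):
    """Cosine similarity of two tag-weight dictionaries."""
    shared = set(weights_a) & set(weights_b)
    dot = sum(weights_a[t] * weights_b[t] for t in shared)
    norm_a = sqrt(sum(w * w for w in weights_a.values()))
    norm_b = sqrt(sum(w * w for w in weights_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Made-up tag profiles (tag -> how often users applied it) for hypothetical artists.
artist_tags = {
    "artist_a": {"rock": 5, "indie": 3, "guitar": 1},
    "artist_b": {"rock": 4, "indie": 2, "live": 1},
    "artist_c": {"techno": 6, "electronic": 4},
}

jac_ab = jaccard(artist_tags["artist_a"], artist_tags["artist_b"])
cos_ab = cosine(artist_tags["artist_a"], artist_tags["artist_b"])
cos_ac = cosine(artist_tags["artist_a"], artist_tags["artist_c"])
print(f"Jaccard(a, b) = {jac_ab:.2f}, cosine(a, b) = {cos_ab:.2f}, cosine(a, c) = {cos_ac:.2f}")
```

The two measures can disagree noticeably: Jaccard treats every shared tag equally, whereas cosine similarity is dominated by heavily used tags, which is one aspect a comparison project could examine.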

References:

A Survey of Music Similarity and Recommendation from Music Context Data, Knees, P. and Schedl, M., ACM Transactions on Multimedia Computing, Communications and Applications, volume 10, issue 1, 2014.
Local and Global Scaling Reduce Hubs in Space, Schnitzer, D., Flexer, A., Schedl, M., and Widmer, G., Journal of Machine Learning Research, volume 13, October 2012.
Tailoring Music Recommendations to Users by Considering Diversity, Mainstreaminess, and Novelty, Schedl, M., and Hauger, D., Proceedings of the 38th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Santiago, Chile, 2015.

Contact: Peter Knees, Markus Schedl

last edited by ms on Sep 24, 2015