Extracting Topical Information of Tweets Using Hashtags


Alp Z. Z., Öğüdücü Ş.

IEEE 14th International Conference on Machine Learning and Applications (ICMLA), Florida, United States, 9-11 December 2015, pp. 644-648

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI: 10.1109/icmla.2015.73
  • City of Publication: Florida
  • Country of Publication: United States
  • Pages: pp. 644-648
  • Affiliated with Istanbul Technical University: Yes

Abstract

Twitter is one of the largest microblogging websites, where users share news, opinions, moods, and recommendations by posting text messages, and it is mostly used as a news medium. Because the volume of data shared via Twitter is vast, many researchers focus on extracting meaningful information from it with the help of information retrieval systems. Retrieving meaningful information from social media applications has become important for several tasks, such as sentiment analysis, anomaly detection, and recommendation systems. Topic modeling is one of the most studied and hardest problems in information retrieval, and it is even more challenging when the documents are as short as tweets. In this paper, we focus on developing an effective and efficient method to overcome the challenge that tweets are too short for topic modeling. We compare different topic modeling schemes based on Latent Dirichlet Allocation (LDA) that merge tweets in order to improve LDA performance; one of these schemes has not been studied before. We also present our experimental results, obtained with unbiased data collection and evaluation methodologies.
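
The abstract does not spell out how tweets are merged, so the following is only a minimal sketch of one plausible scheme suggested by the title: pooling tweets that share a hashtag into a single longer pseudo-document before topic modeling. The function name `pool_tweets_by_hashtag`, the hashtag pattern, the lowercasing of tags, and the handling of tweets without hashtags are all assumptions made for illustration, not the authors' method.

```python
import re
from collections import defaultdict

# Assumption: a hashtag is "#" followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def pool_tweets_by_hashtag(tweets):
    """Merge tweets sharing a hashtag into one pseudo-document per hashtag.

    Returns (merged, unpooled): a dict mapping each lowercased hashtag to
    the concatenated text of its tweets, and a list of tweets that carry
    no hashtag at all.
    """
    pools = defaultdict(list)
    unpooled = []
    for text in tweets:
        # Case-insensitive matching of tags is an assumption of this sketch.
        tags = sorted({tag.lower() for tag in HASHTAG_RE.findall(text)})
        if not tags:
            # Assumption: tweets without hashtags are left unmerged.
            unpooled.append(text)
        for tag in tags:
            # A tweet with several hashtags joins every matching pool.
            pools[tag].append(text)
    merged = {tag: " ".join(texts) for tag, texts in pools.items()}
    return merged, unpooled


# Hypothetical sample data, not taken from the paper.
sample_tweets = [
    "Great keynote on topic models #ICMLA",
    "LDA struggles with short text #icmla #nlp",
    "Coffee first, code later",
]
merged_docs, leftover = pool_tweets_by_hashtag(sample_tweets)
```

Each value in `merged_docs` could then be tokenized and passed to an LDA implementation as a single document, so the model sees longer texts with richer word co-occurrence than individual tweets provide.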