Streaming Linear Regression on Spark MLlib and MOA


Akgun B., Öğüdücü Ş.

IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Paris, France, 25 - 28 August 2015, pp.1244-1247

  • Publication Type: Conference Paper / Full Text Paper
  • DOI Number: 10.1145/2808797.2809374
  • City of Publication: Paris
  • Country of Publication: France
  • Page Numbers: pp.1244-1247
  • Istanbul Technical University Affiliated: Yes

Abstract

In recent years, analyzing data streams has attracted considerable attention in different fields of computer science. In this paper, two frameworks, MOA and Spark MLlib, are examined for linear regression on streaming data. The focus is on determining how well the linear regression techniques implemented in these frameworks can model data streams. We also examine the challenges posed by massive data streams and how MOA and Spark Streaming address them. The experiments show that although MOA is easier to use than Spark MLlib, Spark MLlib's linear regression performs better on streaming data.
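
To make the streaming setting concrete, the following is a minimal sketch of linear regression updated one example at a time with stochastic gradient descent and evaluated prequentially (each example is used for prediction before it is used for training). It is plain Python and does not use the MOA or Spark MLlib APIs. The class `OnlineLinearRegression`, the helpers `synthetic_stream` and `prequential_mse`, the step size, and the synthetic data are all assumptions made for illustration and are not taken from the paper.

```python
import random


class OnlineLinearRegression:
    """Linear model updated per example with SGD on squared error (illustrative)."""

    def __init__(self, n_features, step_size=0.05):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.step_size = step_size  # assumed value, not from the paper

    def predict(self, x):
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, x, y):
        # Gradient of 0.5 * (prediction - y)^2 with respect to weights and bias.
        error = self.predict(x) - y
        self.weights = [
            w - self.step_size * error * xi for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.step_size * error


def synthetic_stream(n, seed=0):
    """Yield (features, target) pairs from y = 2x + 1 plus small Gaussian noise."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0)]
        y = 2.0 * x[0] + 1.0 + rng.gauss(0.0, 0.01)
        yield x, y


def prequential_mse(model, stream):
    """Test-then-train: score each example before the model learns from it."""
    total, count = 0.0, 0
    for x, y in stream:
        total += (model.predict(x) - y) ** 2
        model.update(x, y)
        count += 1
    return total / count


model = OnlineLinearRegression(n_features=1)
mse = prequential_mse(model, synthetic_stream(5000))
print("weights:", model.weights, "bias:", model.bias, "prequential MSE:", mse)
```

The model never stores past examples, so memory use stays constant regardless of stream length. This is the property that makes such incremental learners suitable for massive data streams.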