Streaming Linear Regression on Spark MLlib and MOA

Akgun B., Öğüdücü Ş.

IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Paris, France, 25-28 August 2015, pp. 1244-1247

  • Publication Type: Conference Paper / Full Text
  • DOI Number: 10.1145/2808797.2809374
  • City: Paris
  • Country: France
  • Page Numbers: pp.1244-1247


In recent years, the analysis of data streams has attracted considerable attention in different fields of computer science. In this paper, two frameworks, MOA and Spark MLlib, are examined for linear regression on streaming data. The focus is on determining how well the linear regression techniques implemented in these frameworks can be used to model data streams. We also examine the challenges of massive data streams and how MOA and Spark Streaming address them. The experiments show that although MOA is easier to use than Spark MLlib, Spark MLlib's linear regression performs better on streaming data.
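
As an illustration of what linear regression on a data stream involves, the following is a minimal sketch in plain Python. It does not use the MOA or Spark MLlib APIs and is not the method evaluated in the paper; it only shows the general idea of updating a linear model one mini-batch at a time with a gradient step, so that the full stream never has to be held in memory. All function names, the step size, and the synthetic noise-free data are assumptions made for the example.

```python
import random


def sgd_update(weights, batch, step_size):
    """Apply one mini-batch gradient step for least-squares linear regression."""
    n = len(batch)
    grad = [0.0] * len(weights)
    for features, label in batch:
        # Prediction error of the current model on this example.
        error = sum(w * x for w, x in zip(weights, features)) - label
        for j, x in enumerate(features):
            grad[j] += error * x / n
    return [w - step_size * g for w, g in zip(weights, grad)]


def synthetic_stream(true_weights, num_batches, batch_size, seed=0):
    """Yield mini-batches of (features, label) pairs from a hypothetical linear source."""
    rng = random.Random(seed)
    for _ in range(num_batches):
        batch = []
        for _ in range(batch_size):
            features = [rng.uniform(-1.0, 1.0) for _ in true_weights]
            label = sum(w * x for w, x in zip(true_weights, features))
            batch.append((features, label))
        yield batch


def fit_stream(stream, num_features, step_size):
    """Consume the stream batch by batch, keeping only the current weights."""
    weights = [0.0] * num_features
    for batch in stream:
        weights = sgd_update(weights, batch, step_size)
    return weights


# Assumed ground-truth weights, used only to generate the synthetic stream.
true_weights = [2.0, -3.0]
learned = fit_stream(
    synthetic_stream(true_weights, num_batches=500, batch_size=20),
    num_features=2,
    step_size=0.5,
)
print(learned)
```

Because each batch is discarded after its update, memory use stays constant regardless of stream length, which is the property that makes this style of learner suitable for massive streams. On real data the step size would need tuning, and noisy labels would keep the weights fluctuating around the best fit rather than settling exactly.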