Ensemble Learning Based Stock Market Prediction Enhanced with Sentiment Analysis

Sivri M. S., Üstündağ A., Korkmaz B. S.

International Conference on Intelligent and Fuzzy Systems, INFUS 2021, İstanbul, Turkey, 24 - 26 August 2021, vol.308, pp.446-454

  • Publication Type: Conference Paper / Full Text
  • Volume: 308
  • Doi Number: 10.1007/978-3-030-85577-2_53
  • City: İstanbul
  • Country: Turkey
  • Page Numbers: pp.446-454
  • Keywords: Ensemble learning, Feature selection, Sentiment analysis, Stock Market Prediction
  • Istanbul Technical University Affiliated: Yes


© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Besides technical and fundamental analysis, machine learning and sentiment analysis of unstructured news and comments have been studied extensively for financial market prediction in recent years. It remains unclear how best to combine predictions derived from news, sentiment scores, and financial data. In this study, we propose a methodology to address this issue. Beyond the methodology, this study differs from previous work in its data coverage and in the models used for both sentiment analysis and prediction. We make weekly predictions for stocks traded in the Borsa Istanbul 30 index using ensemble learning and feature selection methods on 683 variables. In addition, we predicted sentiment scores from news covering 18 different sectors and combined both predictions using weighted normalized returns. For the predictions, we used three ensemble learning methods: Random Forests, Extreme Gradient Boosting, and Light Gradient Boosting Machines. Among parameters such as training set length, estimation method, variable selection method, number of variables, and number of models in the prediction method, we selected the combination that gave the best result. For sentiment scores, we tested BERT, Word2Vec, XLNet, and Flair, and then extracted final sentiment scores from the news. With the proposed trading system, we combined the results obtained from the financial variables with the news sentiment scores. The final results show that the combined approach outperformed predictions based on sentiment scores alone and on financial data alone in terms of weekly return and accuracy.
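
The abstract does not state how the two prediction streams are normalized or weighted, so the following is only a minimal sketch of one plausible way to blend a financial-model signal with a sentiment signal. It assumes z-score normalization across stocks and a single blending weight. The function names, the `weight` parameter, and the tickers are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def zscore(values):
    """Scale a list of scores to zero mean and unit variance.

    Returns all zeros when every value is identical.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def combine_signals(model_returns, sentiment_scores, weight=0.5):
    """Blend two per-stock signals after normalizing each across stocks.

    `weight` is the share given to the financial-model signal. It is an
    assumed parameter for illustration, not a value reported in the paper.
    """
    if model_returns.keys() != sentiment_scores.keys():
        raise ValueError("both signals must cover the same tickers")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    tickers = sorted(model_returns)
    z_model = zscore([model_returns[t] for t in tickers])
    z_sent = zscore([sentiment_scores[t] for t in tickers])
    return {
        t: weight * m + (1.0 - weight) * s
        for t, m, s in zip(tickers, z_model, z_sent)
    }


def top_k(combined, k):
    """Return the k tickers with the highest combined score (ties by name)."""
    return sorted(combined, key=lambda t: (-combined[t], t))[:k]


if __name__ == "__main__":
    # Made-up weekly signals for three hypothetical tickers.
    predicted_returns = {"AAA": 0.02, "BBB": -0.01, "CCC": 0.005}
    sentiment = {"AAA": 0.1, "BBB": 0.6, "CCC": -0.3}
    scores = combine_signals(predicted_returns, sentiment, weight=0.7)
    print(top_k(scores, k=2))
```

Normalizing each signal before blending keeps one source from dominating only because of its scale. In practice the weight would have to be chosen on held-out data rather than fixed in advance.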