Forecasting US movies box office performances in Turkey using machine learning algorithms

Cagliyora S., Oztaysi B., Sezgin S.

JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, vol.39, no.5, pp.6579-6590, 2020 (SCI-Expanded) identifier identifier

  • Publication Type: Article / Article
  • Volume: 39 Issue: 5
  • Publication Date: 2020
  • Doi Number: 10.3233/jifs-189120
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Aerospace Database, Applied Science & Technology Source, Business Source Elite, Business Source Premier, Communication Abstracts, Compendex, Computer & Applied Sciences, INSPEC, Metadex, zbMATH, Civil Engineering Abstracts
  • Page Numbers: pp.6579-6590
  • Istanbul Technical University Affiliated: Yes


The motion picture industry is one of the largest industries worldwide and has significant importance in the global economy. Considering the high stakes and high risks in the industry, forecast models and decision support systems are gaining importance. Several attempts have been made to estimate the theatrical performance of a movie before or at the early stages of its release. Nevertheless, these models are mostly used for predicting domestic performances and the industry still struggles to predict box office performances in overseas markets. In this study, the aim is to design a forecast model using different machine learning algorithms to estimate the theatrical success of US movies in Turkey. From various sources, a dataset of 1559 movies is constructed. Firstly, independent variables are grouped as pre-release, distributor type, and international distribution based on their characteristic. The number of attendances is discretized into three classes. Four popular machine learning algorithms, artificial neural networks, decision tree regression and gradient boosting tree and random forest are employed, and the impact of each group is observed by compared by the performance models. Then the number of target classes is increased into five and eight and results are compared with the previously developed models in the literature.