Feeling Analysis for Sadness and Happiness using Google n-gram Database

Donmez I., Sonmez E. B.

3rd International Conference on Computer Science and Engineering (UBMK), Sarajevo, Bosnia and Herzegovina, 20 - 23 September 2018, pp.56-60

  • Publication Type: Conference Paper / Full Text
  • City: Sarajevo
  • Country: Bosnia and Herzegovina
  • Page Numbers: pp.56-60


The current era has been called both the "Digital Age" and the "Information Age", since it is characterized by an exponential growth of data generated by both humans (e.g. social environments) and machines (e.g. the Internet of Things). The challenge is to convert "data" into "information" by analyzing the data and discovering the patterns hidden inside it. In this paper, the two basic human feelings of Happiness and Sadness are extracted from a subset of the Google n-grams corpus and analyzed. The Google n-grams corpus is generated from millions of scanned books published between 1500 and 2008; it can be considered an indicator of human-specific features and behavior. Under the hypothesis that a user's emotion can be extrapolated from the frequency of the corresponding emotional words, this study applies regression to predict the importance of the Happiness and Sadness emotional states in future years.
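
The regression step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which regression model the authors used, so ordinary least-squares fitting of a straight line is an assumption here; the function names `fit_line` and `predict` are hypothetical, and the frequency values are invented for illustration only and are not taken from the Google n-grams corpus.

```python
def fit_line(years, freqs):
    """Fit freq = slope * year + intercept by ordinary least squares.

    Assumption: a straight-line model; the paper may use a different regressor.
    """
    if len(years) != len(freqs) or len(set(years)) < 2:
        raise ValueError("need matching lists with at least two distinct years")
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(freqs) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, freqs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, year):
    """Extrapolate the fitted line to a given (possibly future) year."""
    slope, intercept = model
    return slope * year + intercept


# Invented relative frequencies of one emotion word per year (illustration only).
toy_years = [1970, 1980, 1990, 2000]
toy_freqs = [4.0e-5, 3.8e-5, 3.5e-5, 3.4e-5]

model = fit_line(toy_years, toy_freqs)
print("slope per year:", model[0])
print("extrapolated frequency for 2020:", predict(model, 2020))
```

A negative slope in such a fit would suggest that the word is becoming relatively less frequent over time, which under the paper's hypothesis would be read as a declining importance of the corresponding emotion.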