A probabilistic model for compact document topic representation


Berenyi Z., Vajk I.

9th WSEAS International Conference on Simulation, Modelling and Optimization, Budapest, Hungary, 3-5 September 2009, pp.322-323

  • Publication Type: Conference Paper / Full Text
  • City: Budapest
  • Country: Hungary
  • Page Numbers: pp.322-323

Abstract

When building document categorization systems in distributed mobile environments, feature selection methods are needed to obtain a compact representation of each document topic and to reduce noise during classification. When nodes interact, the locally retrieved features representing a document topic, together with their attributes, have to be shared so that every node obtains a more accurate estimate of the global classifier. Network traffic should be kept to a minimum to reduce costs. We propose a probabilistic model for a keyword selection method; the model makes a more thorough analysis possible and can serve as a basis for sharing information. It can be used to build up the local document topic representations incrementally while keeping network traffic minimal. The description of the probabilistic model is complemented by experimental results.