Mining similar radiology reports using BoW and Fuzzy C-means clustering


Türkeli S. , Akkoca Gazioğlu B. S. , Kurt K. K. , Atay H. T. , Gorur Y.

2017 International Artificial Intelligence and Data Processing Symposium (IDAP), Malatya, Turkey, 16 - 17 September 2017

  • Publication Type: Conference Paper / Full Text
  • City: Malatya
  • Country: Turkey

Abstract

Finding similar diagnoses for the same region is vital for patients. In this paper, we aim to find similar radiology reports using the bag-of-words (BoW) and Fuzzy C-Means clustering methods. A two-layer structure is applied: first, the BoW method is used to extract features from the data, and then the Fuzzy C-Means algorithm is used to cluster the blocks into a similar cluster and a non-similar cluster. We examined 457 radiology reports collected from a research and education hospital in Istanbul. The data were tested against 23 regions and 137 diagnoses. A vocabulary consisting of these regions and diagnoses was created based on the opinion of a radiologist. Experimental results on the data sets show that, for standard documents, BoW and Fuzzy C-Means clustering can be used to find similarity.
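
The two-layer structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the vocabulary `VOCAB`, the sample reports, the helper names `bow_vector` and `fuzzy_c_means`, and the parameter values (fuzzifier `m = 2`, two clusters, Euclidean distance) are all assumptions made for illustration. The sketch counts vocabulary terms in each report to build BoW vectors, then runs the standard Fuzzy C-Means membership and centre updates on them.

```python
import numpy as np

# Hypothetical vocabulary for illustration only; the paper's actual
# vocabulary (23 regions, 137 diagnoses) is not reproduced here.
VOCAB = ["lung", "liver", "kidney", "nodule", "cyst", "effusion"]


def bow_vector(report, vocab=VOCAB):
    """Layer 1: count how often each vocabulary term occurs in a report."""
    tokens = report.lower().split()
    return np.array([tokens.count(term) for term in vocab], dtype=float)


def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Layer 2: standard Fuzzy C-Means with Euclidean distance.

    Returns the membership matrix U (n_samples x n_clusters, rows sum to 1)
    and the cluster centres.
    """
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)  # random fuzzy partition to start
    for _ in range(n_iter):
        W = U ** m
        # Centres are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        # u_ij = d_ij^(-2/(m-1)) / sum_k d_ik^(-2/(m-1))
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers


# Made-up sample reports, assumed already lower-cased and punctuation-free.
reports = [
    "lung nodule right lung",
    "nodule in left lung",
    "liver cyst",
    "cyst in liver and kidney",
]
X = np.vstack([bow_vector(r) for r in reports])
U, centers = fuzzy_c_means(X)
labels = U.argmax(axis=1)  # hard assignment taken from the fuzzy memberships
```

Each row of `U` gives the degree to which a report belongs to each cluster, so reports with close membership profiles can be treated as similar. How the paper maps clusters to "similar" and "non-similar" is not detailed in the abstract, so the `argmax` step here is only one plausible reading.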