On Utilizing Optimal and Information Theoretic Syntactic Modeling for Peptide Classification


Aygun E., Oommen B. J., Çataltepe Z.

4th International Conference on Pattern Recognition in Bioinformatics, Sheffield, United Kingdom, 7-9 September 2009, vol. 5780, pp. 24-25

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume: 5780
  • City of Publication: Sheffield
  • Country of Publication: United Kingdom
  • Pages: pp. 24-25
  • Affiliated with Istanbul Technical University: Yes

Abstract

Syntactic methods in pattern recognition have been used extensively in bioinformatics, and in particular, in the analysis of gene and protein expressions, and in the recognition and classification of biosequences. These methods are almost universally distance-based. This paper concerns the use of an Optimal and Information Theoretic (OIT) probabilistic model [11] to achieve peptide classification using the information residing in their syntactic representations. The latter has traditionally been achieved using the edit distances required in the respective peptide comparisons. We advocate that one can model the differences between compared strings as a mutation model consisting of random Substitutions, Insertions and Deletions (SID) obeying the OIT model. Thus, in this paper, we show that the probability measure obtained from the OIT model can be perceived as a sequence similarity metric, using which a Support Vector Machine (SVM)-based peptide classifier, referred to as OIT-SVM, can be devised.
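
For orientation, the traditional distance-based comparison that the abstract contrasts with the OIT model can be sketched as a plain edit distance between two peptide strings. This is only a minimal illustration of the baseline, not the OIT model or the OIT-SVM classifier: the function name `edit_distance`, the unit costs for substitution, insertion and deletion, and the example peptide strings are all assumptions made here and do not come from the paper.

```python
def edit_distance(x: str, y: str) -> int:
    """Unit-cost edit (Levenshtein) distance between two strings.

    Assumption: every substitution, insertion and deletion costs 1.
    The paper's OIT model assigns probabilities to these operations
    instead; that model is not reproduced here.
    """
    # prev[j] holds the distance between the processed prefix of x and y[:j].
    prev = list(range(len(y) + 1))
    for i, a in enumerate(x, start=1):
        curr = [i]  # distance from x[:i] to the empty string
        for j, b in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a from x
                curr[j - 1] + 1,         # insert b into x
                prev[j - 1] + (a != b),  # substitute a by b, or match
            ))
        prev = curr
    return prev[-1]


# Hypothetical peptide strings over the amino-acid alphabet.
print(edit_distance("ACDEFG", "ACDFG"))  # 1 (a single deletion)
```

In a distance-based classifier, values like these feed the comparison between peptides; the abstract's proposal is to use the probability produced by the OIT mutation model in that role instead.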