The incremental use of morphological information and lexicalization in data-driven dependency parsing

21st International Conference on Computer Processing of Oriental Languages (ICCPOL 2006), Singapore, Singapore, 17-19 December 2006, vol. 4285, pp. 498-500

Publication Type: Conference Paper / Full-Text Paper
Volume: 4285
City of Publication: Singapore
Country of Publication: Singapore
Pages: pp. 498-500
Affiliated with Istanbul Technical University: Yes

Abstract

Typological diversity among the natural languages of the world poses interesting challenges for the models and algorithms used in syntactic parsing. In this paper, we apply a data-driven dependency parser to Turkish, a language characterized by rich morphology and flexible constituent order, and study the effect of employing varying amounts of morpholexical information on parsing performance. The investigations show that accuracy can be improved by using representations based on inflectional groups rather than word forms, confirming earlier studies. In addition, lexicalization and the use of rich morphological features are found to have a positive effect. By combining all these techniques, we obtain the highest reported accuracy for parsing the Turkish Treebank.