Turkish Coreference Resolution


Pamay T., Eryiğit G.

IEEE (SMC) International Conference on Innovations in Intelligent Systems and Applications (INISTA), Thessaloniki, Greece, 3-5 July 2018

  • Publication Type: Conference Paper / Full-Text Paper
  • City of Publication: Thessaloniki
  • Country of Publication: Greece
  • Affiliated with Istanbul Technical University: Yes

Abstract

This paper presents state-of-the-art results in Turkish coreference resolution (CR), the task of determining sets of mentions that refer to the same real-world entity (e.g. a person, a place, a thing, an event). The proposed system uses support vector machines and solves the CR task with a mention-pair model, which takes pairs of mentions and decides whether the two mentions are coreferential. The results are evaluated on the Marmara Turkish Coreference Corpus using well-known evaluation metrics (viz. MUC, B³, BLANC and LEA). The approach obtains F1 scores of 90.68% (MUC), 86.89% (B³), 85.13% (BLANC) and 78.34% (LEA), an improvement of 9.12, 16.06, 13.08 and 12.57 percentage points, respectively, over a recent baseline system for Turkish CR. The paper describes the system setup (SVM parameters and negative sampling strategy) and the selected features, and analyzes the impact of these features on the Turkish CR task.
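The mention-pair formulation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Mention` representation, the function names `make_pairs`, `toy_classifier` and `resolve`, and the union-find step that merges positive pairs into chains are all assumptions made for illustration. In particular, `toy_classifier` is a plain string-match rule standing in for the trained SVM, whose features and parameters are not given in the abstract.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

# Hypothetical mention representation: (position in document, surface text).
Mention = Tuple[int, str]


def make_pairs(mentions: List[Mention]) -> List[Tuple[Mention, Mention]]:
    """Enumerate every (antecedent, anaphor) candidate pair in document order."""
    return list(combinations(sorted(mentions), 2))


def toy_classifier(antecedent: Mention, anaphor: Mention) -> bool:
    """Stand-in for a trained pairwise model (NOT the paper's SVM).

    Assumption for illustration: two mentions corefer if their surface
    strings match, ignoring case.
    """
    return antecedent[1].lower() == anaphor[1].lower()


def resolve(
    mentions: List[Mention],
    is_coreferent: Callable[[Mention, Mention], bool],
) -> List[List[Mention]]:
    """Turn pairwise decisions into coreference chains via union-find."""
    parent: Dict[Mention, Mention] = {m: m for m in mentions}

    def find(m: Mention) -> Mention:
        # Walk up to the root, halving the path along the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for antecedent, anaphor in make_pairs(mentions):
        if is_coreferent(antecedent, anaphor):
            # Merge the two chains; the earlier mention's root stays the root.
            parent[find(anaphor)] = find(antecedent)

    chains: Dict[Mention, List[Mention]] = {}
    for m in sorted(mentions):
        chains.setdefault(find(m), []).append(m)
    return list(chains.values())


if __name__ == "__main__":
    doc = [(0, "Ayşe"), (1, "Ankara"), (2, "ayşe"), (3, "o")]
    for chain in resolve(doc, toy_classifier):
        print(chain)
```

In a real system, `is_coreferent` would wrap a classifier trained on feature vectors extracted from each pair. Enumerating all pairs yields far more negative than positive training instances, which is presumably why the paper treats the negative sampling strategy as part of the system setup.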