A Semi-Automatic Annotation Interface for Named Entity and Relation Annotation on Document Images

Akpınar M. Y., Oral B., Engin D., Emekligil E., Arslan S., Eryiğit G.

2019 4th International Conference on Computer Science and Engineering (UBMK), Samsun, Türkiye, 11-15 September 2019

Publication Type: Conference Paper / Full-Text Paper
DOI: 10.1109/ubmk.2019.8907209
City of Publication: Samsun, Turkey
Country of Publication: Türkiye
Keywords: Optical Character Recognition, Text Processing, Deep Learning, Named Entity Recognition, Relation Extraction, Semi-Automatic Annotation
Affiliated with Istanbul Technical University: Yes

Abstract

Supervised machine learning methods in natural language processing require large quantities of labeled data. In some cases, especially when multiple tasks are conducted on the same data, the annotation process may become exhausting and time-consuming for both the annotators and the interpreters. An effective annotation tool is therefore crucial for both increasing annotation quality and reducing annotation time. In this paper, a semi-automatic annotation tool that aims to decrease manual work and user errors is proposed. The interface of the tool is designed to be user-friendly in order to ease the process. The characteristics and input/output formats of the tool are explained in detail within the paper. The effects of the tool on the speed and accuracy of the users, as well as its automatic labeling accuracy, are analyzed through performance tests. It is noted that a deep learning model trained with a small dataset can decrease the manual entity annotation workload by up to 78.43%.