Educational Corpus of the Uzbek Language and its Opportunities

Abjalova M., Adali E., Iskandarov O.

8th International Conference on Computer Science and Engineering, UBMK 2023, Burdur, Turkey, 13 - 15 September 2023, pp.590-594

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/ubmk59864.2023.10286682
  • City: Burdur
  • Country: Turkey
  • Page Numbers: pp.590-594
  • Keywords: database, dictionary of word's frequency, educational corpus, lexicography, rule-based, stochastic method, terminological dictionary
  • Istanbul Technical University Affiliated: Yes


The educational corpus is a language corpus built from school textbooks and dictionaries, and it is a structural type of the Uzbek National Corpus. In accordance with the 'Concept of the National Corpus of the Uzbek language', the educational corpus of the Uzbek language was created within the framework of the practical project AM-FZ-201908172, 'Creation of the educational corpus of the Uzbek language'; its first presentation was held on April 23, 2021. Since then, its database has been continually enriched and developed. A language corpus is an important tool in language education, a process that involves the creation of dictionaries, various kinds of research, the diachronic and synchronic study of language, and the development of speech competence, vocabulary, and speech patterns. The educational corpus is a unique technological tool for studying Uzbek as a native language and for using it as a foreign one. The educational corpus of the Uzbek language is therefore the first major stage of the 'Concept of creating the National Corpus of the Uzbek language'. This article discusses the factors, principles, models, databases, architecture, information systems, and opportunities involved in creating an educational corpus of the Uzbek language.