---
license: apache-2.0
language:
- el
pipeline_tag: token-classification
library_name: stanza
tags:
- Greek dialect
---

# UD East Cretan Dialect Treebank

The **East Cretan dialect** is a variety of **Modern Greek** primarily used on the island of **Crete** and by the **Cretan diaspora**, including communities relocated to **Hamidieh in Syria** and **Western Asia Minor** following the 1923 population exchange. The dialect has been shaped by the island's long-term isolation and successive domination by **Arabs, Venetians, and Turks**, resulting in distinct phonological, morphological, and lexical characteristics.

Cretan as a whole is divided into **western and eastern subgroups**, with the boundary roughly coinciding with the border between the prefectures of **Rethymno** and **Heraklion**. The **eastern group** is more homogeneous, while the western shows more variation. Unlike many Modern Greek dialects, **East Cretan remains actively spoken**, serving as the main means of communication in much of the island.

---

## 🗣️ Dataset Summary

This model was trained on the **6th round of the East Cretan dataset**, which includes:

- **180 training sentences** (2,976 tokens)
- **60 development sentences** (1,129 tokens)
- **30 test sentences** (523 tokens)

Annotations follow the **Universal Dependencies v2 schema** for morphological, syntactic, and lemmatization layers.

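The split sizes listed above can be checked against the CoNLL-U files themselves. The sketch below is a minimal, dependency-free counter. It assumes the standard CoNLL-U layout (sentences separated by blank lines, `#` comment lines, tab-separated columns) and counts syntactic words, skipping multiword-token ranges such as `1-2` and empty nodes such as `1.1`. Whether the figures in this card count words or surface tokens is not stated, and the function name and sample rows are illustrative only.

```python
def count_conllu(text):
    """Return (sentences, words) for a CoNLL-U formatted string."""
    sentences = 0
    words = 0
    in_sentence = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence.
            in_sentence = False
            continue
        if line.startswith("#"):
            # Sentence-level comments (sent_id, text, ...) carry no tokens.
            continue
        token_id = line.split("\t")[0]
        if "-" in token_id or "." in token_id:
            # Skip multiword-token ranges (1-2) and empty nodes (1.1).
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        words += 1
    return sentences, words


# Made-up sample rows, not taken from the treebank.
sample = "\n".join([
    "# sent_id = demo-1",
    "1\tA\ta\tX\t_\t_\t0\troot\t_\t_",
    "2\tB\tb\tX\t_\t_\t1\tdep\t_\t_",
    "",
    "# sent_id = demo-2",
    "1-2\tCD\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\tC\tc\tX\t_\t_\t0\troot\t_\t_",
    "2\tD\td\tX\t_\t_\t1\tdep\t_\t_",
    "",
])
print(count_conllu(sample))  # (2, 4)
```

To check a real split, read the file with `open(path, encoding="utf-8").read()` and pass the result to `count_conllu`.
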
---

## Model Performance

| **Metric** | **Accuracy (%)** |
|------------|----------------:|
| UPOS | 92.90 |
| XPOS | 89.45 |
| UFeats | 85.60 |
| AllTags | 77.48 |
| Lemmas | 88.44 |
| UAS | 85.40 |
| LAS | 78.30 |
| CLAS | 72.76 |
| MLAS | 57.09 |
| BLEX | 61.57 |
| ELAS | 0.00 |
| EULAS | 0.00 |

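For readers unfamiliar with the attachment metrics: UAS is the share of words whose predicted head is correct, and LAS additionally requires the dependency label to match. The sketch below is a simplified illustration, not the evaluation script used to produce this table. It assumes the gold and predicted analyses share the same tokenization, in which case both scores reduce to plain accuracy, and it compares labels exactly. The function name and toy data are hypothetical.

```python
def attachment_scores(gold, pred):
    """Compute (UAS, LAS) as percentages.

    gold, pred: lists of (head, deprel) pairs, one per word, in the same order.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same words")
    if not gold:
        return 0.0, 0.0
    # UAS: only the head index has to match.
    head_ok = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS: head index and dependency label both have to match.
    both_ok = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return 100.0 * head_ok / n, 100.0 * both_ok / n


# Toy four-word sentence: one wrong label, one wrong head.
gold = [(2, "det"), (3, "nsubj"), (0, "root"), (3, "obj")]
pred = [(2, "det"), (3, "obj"), (0, "root"), (2, "obj")]
uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.2f} LAS={las:.2f}")  # UAS=75.00 LAS=50.00
```
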
---

## Citation

To cite this work or read more about the training pipeline, see:

Socrates Vakirtzian, Vivian Stamou, Yannis Kazos, Stella Markantonatou. (2024). **Dialectal treebanks and their relation with the standard variety: The case of East Cretan and Standard Modern Greek.** The Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), Tallinn, Estonia, March 2–5, 2025.