Syntactic Complexity of Written Texts in Russian and English as Foreign Languages

Master's Thesis (HSE)

Elizaveta Klykova, 2024

This study offers a comprehensive perspective on syntactic complexity in English and Russian texts written by L1 and L2 speakers. We analyze 20 syntactic complexity measures at the sentential, clausal, and phrasal levels, and explore how they relate to one another and to proficiency, task type, and genre. We propose a new measure of syntactic complexity based on Levenshtein distance at the clausal level. Our findings reveal strong correlations among length-based measures and highlight problems with the Coordination Index commonly used in the literature. We also find support for the idea that complexity generally increases with proficiency, with some measures plateauing at advanced levels. Syntactic complexity measures can also reliably distinguish between texts of different genres and task types; some values are language-specific and differ between the two languages considered. Despite the challenging nature of our data, some complexity features, namely length-based indices and phrasal complexity measures, are useful for automatic proficiency prediction. As a practical application of our research, we introduce syntaxcomp, a Python library for extracting syntactic complexity measures from CoNLL-U annotations.
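To give a rough sense of what such measures look like in practice, here is a minimal standard-library sketch. It does not use syntaxcomp and does not reproduce the thesis's definitions: the function names, the sample sentences, and the choice to compare whole sentences rather than segmented clauses are all assumptions made for illustration. It reads CoNLL-U text, computes one length-based measure and dependency tree depth, and takes the Levenshtein distance between two UPOS tag sequences.

```python
from statistics import mean

# Hand-written CoNLL-U sample (illustrative only, not taken from the thesis data).
SAMPLE = """\
# text = She reads books.
1\tShe\tshe\tPRON\t_\t_\t2\tnsubj\t_\t_
2\treads\tread\tVERB\t_\t_\t0\troot\t_\t_
3\tbooks\tbook\tNOUN\t_\t_\t2\tobj\t_\t_
4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_

# text = He said that she left.
1\tHe\the\tPRON\t_\t_\t2\tnsubj\t_\t_
2\tsaid\tsay\tVERB\t_\t_\t0\troot\t_\t_
3\tthat\tthat\tSCONJ\t_\t_\t5\tmark\t_\t_
4\tshe\tshe\tPRON\t_\t_\t5\tnsubj\t_\t_
5\tleft\tleave\tVERB\t_\t_\t2\tccomp\t_\t_
6\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def parse_conllu(text):
    """Return sentences as lists of (id, upos, head, deprel) tuples."""
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:                      # a blank line ends a sentence
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):          # sentence-level comment
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():         # skip multiword ranges and empty nodes
            continue
        current.append((int(cols[0]), cols[3], int(cols[6]), cols[7]))
    if current:
        sentences.append(current)
    return sentences


def sentence_length(sentence):
    """Number of tokens, excluding punctuation (an assumed convention)."""
    return sum(1 for _, upos, _, _ in sentence if upos != "PUNCT")


def tree_depth(sentence):
    """Longest path, in arcs, from any token up to the root."""
    heads = {tid: head for tid, _, head, _ in sentence}

    def depth(tid):
        steps = 0
        while heads[tid] != 0:
            tid = heads[tid]
            steps += 1
        return steps

    return max(depth(tid) for tid in heads)


def upos_sequence(sentence):
    """UPOS tags of a sentence without punctuation."""
    return [upos for _, upos, _, _ in sentence if upos != "PUNCT"]


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


sentences = parse_conllu(SAMPLE)
print("mean sentence length:", mean(sentence_length(s) for s in sentences))
print("tree depths:", [tree_depth(s) for s in sentences])
print("UPOS edit distance:",
      levenshtein(upos_sequence(sentences[0]), upos_sequence(sentences[1])))
```

A clause-level version would first need to split each sentence into clauses, for example by following clausal dependency relations, which this sketch leaves out.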

Klykova, E. A. (2024). Syntactic Complexity of Written Texts in Russian and English as Foreign Languages [Master's Thesis, Higher School of Economics]. https://www.hse.ru/en/edu/vkr/931188956

01_data_preprocessing.ipynb
02_spell_checking_annotation.ipynb
03_complexity_measures.ipynb
04_stats_and_models.ipynb
README.md

eaklykova/syntactic_complexity
