Repository for my Master's Thesis on syntactic complexity of Russian and English as foreign languages

eaklykova/syntactic_complexity

Syntactic Complexity of Written Texts in Russian and English as Foreign Languages

Master's Thesis (HSE)

Elizaveta Klykova, 2024

This study offers a comprehensive perspective on syntactic complexity in English and Russian texts written by L1 and L2 speakers. We analyze 20 syntactic complexity measures at the sentential, clausal, and phrasal levels and explore their interrelationships as well as their correlation with proficiency, task type, and genre. We propose a new measure of syntactic complexity based on Levenshtein distance at the clausal level. Our findings reveal strong correlations among length-based measures and highlight the problematic nature of the Coordination Index commonly used in the literature. We also find support for the idea that complexity generally increases with proficiency, with some measures plateauing at advanced levels. Syntactic complexity measures can also reliably distinguish between texts of different genres and task types; some values are language-specific and differ between the two languages considered. Despite the challenging nature of our data, some complexity features, namely length-based indices and phrasal complexity measures, are useful for automatic proficiency prediction. As a practical application of our research, we introduce syntaxcomp, a Python library for extracting syntactic complexity measures from CoNLL-U annotations.

Klykova, E. A. (2024). Syntactic Complexity of Written Texts in Russian and English as Foreign Languages [Master's Thesis, Higher School of Economics]. https://www.hse.ru/en/edu/vkr/931188956
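
To make the kinds of measures mentioned in the abstract more concrete, here is a minimal sketch in plain Python. It does not use syntaxcomp and does not reproduce the thesis's definitions: the helper names (`parse_conllu`, `mean_sentence_length`, `levenshtein`), the choice to exclude punctuation from length counts, and the comparison of dependency-relation sequences are all assumptions made for illustration. It reads CoNLL-U text, computes one length-based index, and computes a Levenshtein distance between two sequences of dependency labels.

```python
def parse_conllu(text):
    """Parse CoNLL-U text into sentences of (form, upos, head, deprel) tuples."""
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:                      # a blank line ends a sentence
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):          # sentence-level comments
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():         # skip multiword tokens (1-2) and empty nodes (1.1)
            continue
        current.append((cols[1], cols[3], int(cols[6]), cols[7]))
    if current:
        sentences.append(current)
    return sentences


def mean_sentence_length(sentences):
    """Mean number of non-punctuation tokens per sentence (illustrative definition)."""
    lengths = [sum(1 for tok in sent if tok[1] != "PUNCT") for sent in sentences]
    return sum(lengths) / len(lengths) if lengths else 0.0


def levenshtein(a, b):
    """Edit distance between two sequences (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


# Hypothetical two-sentence sample; columns are separated by spaces here
# and converted to the tab-separated CoNLL-U layout below.
_ROWS = [
    ["1 She she PRON _ _ 2 nsubj _ _",
     "2 reads read VERB _ _ 0 root _ _",
     "3 books book NOUN _ _ 2 obj _ _",
     "4 . . PUNCT _ _ 2 punct _ _"],
    ["1 He he PRON _ _ 2 nsubj _ _",
     "2 sleeps sleep VERB _ _ 0 root _ _",
     "3 . . PUNCT _ _ 2 punct _ _"],
]
SAMPLE = "\n\n".join("\n".join("\t".join(r.split()) for r in sent) for sent in _ROWS)

sentences = parse_conllu(SAMPLE)
deprels = [[tok[3] for tok in sent] for sent in sentences]
print(mean_sentence_length(sentences))      # length-based index
print(levenshtein(deprels[0], deprels[1]))  # distance between deprel sequences
```

For the measures actually used in the study and the real interface of syntaxcomp, refer to the thesis and the library's own documentation.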
