Meelen, Marieke, and David Willis, “Towards a historical treebank of Middle and Early Modern Welsh, part I: workflow and POS tagging”, Journal of Celtic Linguistics 22 (2021): 125–154.  

This article introduces the working methods of the Parsed Historical Corpus of the Welsh Language (PARSHCWL). The corpus is designed to provide researchers with a tool for automatic exhaustive extraction of instances of grammatical structures from Middle and Modern Welsh texts in a way comparable to similar tools that already exist for various European languages. The major features of the corpus are outlined, along with the overall architecture of the workflow needed for a team of researchers to produce it. In this paper, the first two stages of the process, namely pre-processing of texts and automated part-of-speech (POS) tagging, are discussed in some detail, focusing in particular on major issues involved in defining word boundaries and in defining a robust and useful tagset.

Meelen, Marieke, “Annotating Middle Welsh: POS tagging and chunk-parsing a corpus of native prose”, in: Lash, Elliott, Fangzhe Qiu, and David Stifter (eds), Morphosyntactic variation in medieval Celtic languages: corpus-based approaches, Trends in Linguistics. Studies and Monographs 346, Berlin, Online: De Gruyter Mouton, 2020. 27–48.
Meelen, Marieke, and Silva Nurmio, “Adjectival agreement in Middle and Early Modern Welsh native and translated prose”, Journal of Celtic Linguistics 21 (2020): 1–28.  
This paper investigates adjectival agreement in a group of Middle Welsh native prose texts and a sample of translations from around the end of the Middle Welsh period and the beginning of the Early Modern period. It presents a new methodology, employing tagged historical corpora allowing for consistent linguistic comparison. The adjectival agreement case study tests a hypothesis regarding position and function of adjectives in Middle Welsh, as well as specific semantic groups of adjectives, such as colours or related modifiers. The systematic analysis using an annotated corpus reveals that there are interesting differences between native and translated texts, as well as between individual texts. However, zooming in on our adjectival agreement case study, we conclude that these differences do not correspond to many of our hypotheses or assumptions about how certain texts group together. In particular, no clear split into native and translated texts emerged between the texts in our corpus. This paper thus shows interesting results for both (historical) linguists, especially those working on agreement, and scholars of medieval Celtic philology and translation texts.
Meelen, Marieke, “Object-initial word order in Middle Welsh narrative prose”, in: Poppe, Erich, Karin Stüber, and Paul Widmer (eds), Referential properties and their impact on the syntax of Insular Celtic languages, Studien und Texte zur Keltologie 14, Münster: Nodus Publikationen, 2017. 145–178.
Meelen, Marieke, “Why Jesus and Job spoke bad Welsh: the origin and distribution of V2 orders in Middle Welsh”, PhD thesis: Leiden University, 2016.  
This thesis covers a wide range of topics from historical to computational and corpus linguistics as well as synchronic and diachronic syntax and information structure. The latest insights in each of these sub-fields of linguistics are necessary to address what has been a vexed problem in the study of Middle Welsh for a long time. Middle Welsh word order is particularly puzzling, because there is a wide range of verb-second patterns and the distribution of those is not at all clear. Moreover, these so-called 'Abnormal Orders' are only found in the Middle Welsh period; Old and Modern Welsh mainly exhibit verb-initial patterns. Verb-second orders are shown to have developed from earlier patterns with hanging topics and focussed cleft constructions by carefully reconstructing their syntactic history in Old Welsh and related Celtic languages. A detailed analysis of a syntactically and pragmatically annotated corpus, built especially for this thesis, reveals that a combination of these features explains which word-order pattern appears in which particular context. From a diachronic syntactic point of view, Middle Welsh shares some crucial developments in the rise of V2 with Early Romance, but it differs in others.
Zimmer, Stefan, “Nog een Indo-Keltische parallel: Iers cethrochair, etc.”, tr. Marieke Meelen, Kelten: Mededelingen van de Stichting A. G. van Hamel voor Keltische Studies 52 (November, 2011): 5–7.


