Times are given in Irish Standard Time (IST), i.e., either UTC+0 or UTC+1 (Daylight Saving).
This event has passed.
Cardamom Seminar Series #8 – Ekaterina Vylomova (University of Melbourne)
February 1 @ 10:00 am – 11:00 am UTC
The Unit for Linguistic Data at the Insight SFI Research Centre for Data Analytics / Data Science Institute, National University of Ireland Galway, is delighted to welcome Dr Ekaterina Vylomova, a Lecturer and Postdoctoral Fellow at the University of Melbourne, as the next speaker in our seminar series. The title of her talk is “Documenting and modelling inflectional paradigms in under-resourced languages”. She will talk about the UniMorph project, which attempts to create a universal (cross-lingual) annotation schema.
This talk will present the UniMorph project, an attempt to create a universal (cross-lingual) annotation schema. UniMorph allows an inflected word from any language to be defined by its lexical meaning, typically carried by the lemma, and a bundle of universal morphological features defined by the schema. Since 2016, the UniMorph database has been gradually developed and updated with new languages, and the SIGMORPHON shared tasks have served as a platform for comparing computational models of inflectional morphology. From 2016 to 2021, the shared tasks made it possible to explore data-driven systems’ ability to learn declension and conjugation paradigms and to evaluate how well they generalize across typologically diverse languages. This matters because developing formal techniques for cross-language generalization and for predicting universal entities across related languages should open new possibilities for the modelling and documentation of under-resourced languages. The talk will outline the major challenges we faced while converting language-specific features into the UniMorph schema, especially for under-resourced languages. In addition, we will discuss typical errors made by the majority of the systems, e.g. incorrect predictions due to allomorphy, form variation, misspelt words, and looping effects. Finally, the talk will present case studies for Russian, Tibetan, and Nen.
About the Speaker:
Dr Ekaterina Vylomova is a Lecturer and Postdoctoral Fellow at the University of Melbourne. Her research focuses on compositionality modelling for morphology, models of inflectional and derivational morphology, linguistic typology, diachronic language models, and neural machine translation. She co-organized the SIGTYP 2019–2021 workshops and shared tasks and the SIGMORPHON 2017–2021 shared tasks on morphological re-inflection.
The seminar series is led by the Cardamom project team. The Cardamom project aims to close the resource gap for minority and under-resourced languages by using deep-learning-based natural language processing (NLP) and exploiting similarities between closely related languages. The project further extends this idea to historical languages, which can be considered closely related to their modern forms. It aims to provide NLP through both space and time for languages that current approaches have ignored.