Times are given in Irish Standard Time (IST), i.e., either UTC+0 or UTC+1 (Daylight Saving).



Cardamom Seminar Series #22 – Dr Mark Faulkner (Trinity College Dublin)

May 29, 2023 @ 5:00 pm – 6:00 pm IST

Towards Medieval Big Data: corpora, metadata and methodologies for early English

The Unit for Linguistic Data at the Insight SFI Research Centre for Data Analytics, Data Science Institute, University of Galway, is delighted to welcome Mark Faulkner, an assistant professor at Trinity College, Dublin, as the next speaker in our seminar series. He will talk about the importance of handwritten text recognition in building new corpora focused on old English. Register here.


Philology, the science of written texts, has been criticised by Jenset and McGillivray (2017) as an example-based approach, where conclusions are based on a handful of forms collected arbitrarily by a human reader. This paper explores how new technologies can help respond to their call for a more data-driven approach. Its focus is early English (ca. 600-1300), but the approaches will be applicable to any pre-modern language. In particular, the paper discusses possibilities for using Handwritten Text Recognition to build new, richer corpora; the use of linked data to enrich corpus metadata (based on the speaker’s IRC-Coalesce-funded project, Searobend: Linked Metadata for English Language Texts, 1000-1300); and new methods for the semi-automated extraction of large datasets with which to rewrite the grammars of early English and for identifying linguistic similarities between texts that can help solve questions of authorship attribution or dating.

About the Speaker:

Mark Faulkner is Ussher Assistant Professor of Medieval Literature at Trinity College, Dublin and PI of the IRC-Coalesce-funded Searobend: Linked Metadata for English Language Texts, 1000-1300, a collaboration with linked data expert Prof. Declan O’Sullivan. His New Literary History of the Long Twelfth Century: Language and Literature between Old and Middle English was published by Cambridge University Press in 2022 and has been described by reviewers as ‘a milestone in the study of early English’ and ‘far and away the best study of the period to date’. He developed the M. Phil in Medieval Studies at Trinity, directing it for its first three years, and is now the inaugural director of the Trinity Centre for the Book. He was elected a fellow of Trinity College Dublin in 2023.


The seminar series is led by the Cardamom project team. The Cardamom project aims to close the resource gap for minority and under-resourced languages using deep-learning-based natural language processing (NLP) and exploiting similarities of closely related languages. The project further extends this idea to historical languages, which can be considered closely related to their modern form. It aims to provide NLP through both space and time for languages that current approaches have ignored.

Registration link: