L2L: From Lemmatizers to Linkers
Milano, 2022 - 2023
The goal of the L2L project is to create a new CLARIN Resource Family of interoperable resources based on the Linguistic Linked Open Data (LLOD) paradigm. To this end, the project reviews the LOD-capable resources already integrated in CLARIN and extends the model of the “LiLa-Linking Latin” project (http://lila-erc.eu). The focus is on interoperability between lemmatizers, corpora, and lexicons. The first step of L2L is to list all the LOD-compliant resources available in CLARIN’s VLO and Switchboard that satisfy at least the first four points of Berners-Lee’s 5-star classification and are therefore potentially ready to be included in the Resource Family. The second step is to integrate LiLa’s Text Linker into the CLARIN Switchboard; the Text Linker is a lemmatizer for Latin that generates RDF output in which tokens are linked to LiLa’s lemma collection. Finally, an extension of the LiLa model will be tested on a sample language (Italian), using only CLARIN resources. The aim is to set up a prototype service that leverages resources in the Switchboard and VLO to generate interoperable data that meet the requirements of the Resource Family.
At the end of L2L, the creation of a CLARIN Knowledge Center dedicated to interoperable, LLOD-compliant language resources will be promoted.
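The token-to-lemma linking described above can be illustrated with a minimal sketch. It is not the Text Linker's actual implementation: the namespaces, the `hasLemma` property URI, the toy lemma bank, and the function name `link_tokens` are all hypothetical placeholders chosen for illustration, and real lemmatization is replaced here by a simple dictionary lookup. The sketch only shows the general shape of the output, namely one RDF triple (in N-Triples syntax) per token, pointing to a lemma URI.

```python
# Hypothetical namespaces, for illustration only (not the real LiLa URIs).
DOC_NS = "http://example.org/doc/"
LEMMA_NS = "http://example.org/lemma/"
HAS_LEMMA = "http://example.org/vocab/hasLemma"

# Toy lemma bank (an assumption): inflected form -> hypothetical lemma identifier.
LEMMA_BANK = {
    "puellam": "puella",
    "amat": "amo",
    "poeta": "poeta",
}


def link_tokens(text: str) -> list[str]:
    """Return N-Triples lines linking each known token to its lemma URI."""
    triples = []
    # Tokens are numbered from 1 by their position in the whitespace-split text.
    for position, form in enumerate(text.lower().split(), start=1):
        lemma = LEMMA_BANK.get(form)
        if lemma is None:
            continue  # forms missing from the toy bank are left unlinked
        triples.append(
            f"<{DOC_NS}token/{position}> <{HAS_LEMMA}> <{LEMMA_NS}{lemma}> ."
        )
    return triples


if __name__ == "__main__":
    for line in link_tokens("Poeta puellam amat"):
        print(line)
```

Because every token points to a shared lemma URI rather than to a string, any other corpus or lexicon that uses the same lemma URIs becomes queryable together with this text, which is the kind of interoperability the Resource Family is meant to provide.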
Working group:
- Marco Carlo Passarotti - Director of CIRCSE
- Francesco Mambrini
- Giovanni Moretti
Location: Milan
Scientific area: antiquity, philological-literary, and historical-artistic studies
Principal investigator: Francesco Mambrini
Research period: 2022 - 2023