Within the NLP community, interoperability has been a major issue for the last 25 years. It has been the subject of several standardization efforts, but it remains at best a partially solved problem.

Interoperability of linguistic resources involves two major aspects: structural interoperability (annotations of different origin are represented using the same formalism) and conceptual interoperability (annotations of different origin are linked to a common vocabulary). Recently, it has been argued that both aspects can be addressed by representing linguistic resources using Semantic Web formalisms and in accordance with the Linked Data paradigm (Chiarcos et al., 2013).

In particular, the RDF data model (labeled directed multigraphs) generalizes over the concept of feature structures that underlies existing efforts to standardize corpora (ISO TC37/SC4:LAF, TEI), linguistic annotations (EAGLES, ISO TC37/SC4:ISOcat), and lexical resources (ISO TC37/SC4:LMF, TEI), thereby contributing to interoperability between these standardization efforts.
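As a rough illustration of the point above, a nested feature structure can be flattened into labeled directed edges, which is the shape of RDF triples. This is a minimal sketch in plain Python, not an RDF implementation: the function name to_triples, the "_:n0"-style node identifiers, and the example token are assumptions made for illustration only.

```python
from itertools import count


def to_triples(fs, ids=None):
    """Flatten a nested feature structure (a dict) into labeled edges.

    Returns (root, triples); each triple is (subject, label, object).
    Node names such as "_:n0" are hypothetical blank-node style identifiers.
    """
    ids = ids if ids is not None else count()
    root = f"_:n{next(ids)}"
    triples = []
    for feature, value in fs.items():
        if isinstance(value, dict):
            # A complex value becomes a node of its own, linked by the feature name.
            child, sub = to_triples(value, ids)
            triples.append((root, feature, child))
            triples.extend(sub)
        else:
            # An atomic value becomes the object of a single edge.
            triples.append((root, feature, value))
    return root, triples


# Illustrative token annotation (hypothetical feature names and values).
token = {"form": "dogs", "pos": "noun", "agr": {"num": "pl", "pers": "3"}}
root, triples = to_triples(token)
for triple in triples:
    print(triple)
```

The reverse direction does not hold in general: a multigraph may carry several edges with the same label from one node, or edges between arbitrary nodes, neither of which a single feature structure can express. That is the sense in which the graph model is the more general one.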

This talk provides a general introduction to the topic and elaborates on two selected use cases:
– exploiting structural interoperability: combining annotated corpora and lexical resources
– exploiting conceptual interoperability: dealing with heterogeneous annotations in NLP pipelines
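
The second use case can be sketched as follows: tags from different tagsets are linked to a shared vocabulary, so that a pipeline component can query concepts rather than tool-specific labels. This is a toy sketch under stated assumptions. The concept names ("Noun", "Verb", "Plural", "Finite"), the mapping table, and the helper is_a are hypothetical and are not taken from any published vocabulary or from the talk itself; only the tag labels come from the Penn Treebank and STTS tagsets.

```python
# Hypothetical links from (tagset, tag) pairs to shared concepts.
SHARED_CONCEPTS = {
    ("penn", "NN"): {"Noun"},
    ("penn", "NNS"): {"Noun", "Plural"},
    ("penn", "VBZ"): {"Verb", "Finite"},
    ("stts", "NN"): {"Noun"},
    ("stts", "VVFIN"): {"Verb", "Finite"},
}


def is_a(tagset, tag, concept):
    """Return True if the tag is linked to the given shared concept."""
    return concept in SHARED_CONCEPTS.get((tagset, tag), set())


# Output of two taggers that use different tagsets (illustrative data).
annotations = [
    ("penn", "dogs", "NNS"),
    ("stts", "Hunde", "NN"),
    ("stts", "bellen", "VVFIN"),
]

# Select nouns without knowing which tagset produced each annotation.
nouns = [form for tagset, form, tag in annotations if is_a(tagset, tag, "Noun")]
print(nouns)
```

Downstream components then depend only on the shared concepts, so adding a tagger with another tagset means adding links to the table rather than changing the components.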


Date: 2015-02-12 10:30 - 12:00

Location: L308, Lennart Torstenssonsgatan 8


