Abstract syntax is a concept from compilers and programming language semantics: a tree representation that abstracts away from the order and shape of lexical items. GF (Grammatical Framework) is a grammar formalism that applies abstract syntax to natural languages. Its initial purpose was to build domain-specific translation systems based on semantic interlinguas. GF has since scaled up to wide-coverage grammars based on the GF Resource Grammar Library, which applies a shared abstract syntax to 30 languages. The Universal Dependencies (UD) initiative is a more recent, but already more widely known, approach that also uses shared concepts: the labels and part-of-speech tags in dependency trees. UD trees are built by parsers trained on treebanks, whereas GF uses explicit grammar rules. Thus UD relies on manual work for annotating treebanks, whereas GF relies on manual work for writing grammars.

In this talk, we present a conversion from GF abstract syntax trees to UD dependency trees. The conversion has several potential applications: (1) it makes the GF parser usable as a rule-based dependency parser; (2) it enables bootstrapping UD treebanks from GF treebanks; (3) it defines a formal way to assess the informal annotation schemes of UD; (4) it gives a method for checking the consistency of manually annotated UD trees with respect to the annotation schemes; (5) it makes information from UD treebanks available for the construction and ranking of GF trees, which can be expected to improve GF applications such as machine translation. The conversion is tested and evaluated by bootstrapping a small treebank for 32 languages and by comparing the GF version of the English Penn Treebank with the standard UD version.

Date: 2015-12-03 10:30 - 12:00

Location: L308, Lennart Torstenssonsgatan 8


