Automatic diachronic linking of Swedish lexical resources
Språkbanken <http://språkbanken.gu.se> maintains a number of digitized lexical resources in various stages of preparation, representing, among other things, various historical forms of Swedish. One of these resources – SALDO – is singled out as the pivot resource to which all the others should be linked in some way. The hope is that the resulting interlinking of the lexicons will enable many kinds of linguistic information to be transferred among them. However, the interlinking of the lexical resources has only just begun, and there is much scope for innovation.
This problem is an open one and should be suitably narrowed down to be solvable within the framework of a master’s thesis, e.g., by focusing on one pair of lexical resources, but with a view to the general applicability of the proposed solution. On the one hand, there are the lexicons themselves with their associated, partly overlapping linguistic information. On the other hand, there are various external resources, such as text corpora representing different historical language stages, and possibly freely available external lexicons. The problem, more narrowly construed, consists in proposing and implementing a set of tools for interlinking the lexicons, using any and all relevant information available, together with some kind of evaluation procedure. The interlinking should be semi-automatic, and the extent of the manual component should be explicitly indicated as part of the result (e.g., as the number and percentage of ambiguous links).
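To make the idea of semi-automatic linking with a reported manual component more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not a proposed solution: the pivot identifiers (`hus_1`, `val_1`, ...), the function names, the similarity threshold, and the toy word forms are all invented for this example and do not reflect the actual format of SALDO or of any other Språkbanken lexicon. Plain string similarity stands in for whatever richer evidence (part of speech, inflection, corpus data) a real tool would use.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Surface similarity of two strings in [0, 1] (stand-in for richer evidence)."""
    return SequenceMatcher(None, a, b).ratio()


def link_entries(historical_forms, pivot, threshold=0.7, margin=0.05):
    """Propose links from historical forms to pivot entries.

    pivot maps a hypothetical entry identifier to its written form.
    Returns {form: (status, [pivot ids])}, where status is one of
    'linked' (exactly one top candidate), 'ambiguous' (several top
    candidates, left for manual resolution), or 'unlinked'.
    The threshold and margin values are arbitrary assumptions.
    """
    links = {}
    for form in historical_forms:
        scored = [(similarity(form, written), pid) for pid, written in pivot.items()]
        candidates = [(s, pid) for s, pid in scored if s >= threshold]
        if not candidates:
            links[form] = ("unlinked", [])
            continue
        best = max(s for s, _ in candidates)
        # Keep every candidate whose score is within `margin` of the best one.
        top = sorted(pid for s, pid in candidates if best - s <= margin)
        status = "linked" if len(top) == 1 else "ambiguous"
        links[form] = (status, top)
    return links


def ambiguity_report(links):
    """Return (number, percentage) of forms whose link needs manual resolution."""
    total = len(links)
    ambiguous = sum(1 for status, _ in links.values() if status == "ambiguous")
    percentage = 100.0 * ambiguous / total if total else 0.0
    return ambiguous, percentage


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    pivot = {"hus_1": "hus", "val_1": "val", "val_2": "val", "bok_1": "bok"}
    historical_forms = ["huus", "val", "qwx"]

    links = link_entries(historical_forms, pivot)
    for form, (status, ids) in links.items():
        print(form, status, ids)

    count, pct = ambiguity_report(links)
    print(f"ambiguous links: {count} ({pct:.1f}%)")
```

On this toy input, the homograph "val" matches two pivot entries equally well and is therefore left for manual resolution, which is the kind of case the reported number and percentage of ambiguous links is meant to capture.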
Lars Borin and possibly others, Språkbanken