Build a dialogue system module that, given a grammar defining the strings the system can understand, takes input from an open online automatic speech recognition (ASR) service (e.g. Google) and maps it onto the (phonetically) nearest string generated by the grammar.
This thesis project will be carried out within Talkamatic's EU FP7 Alfred project.
Open ASR is available over the internet, but its results are hard to use in dialogue systems with limited language understanding capabilities. ASR output often contains errors because the recognizer does not know the vocabulary of the domain that the system handles. The task of this project is to come up with innovative and practically useful ways of mapping ASR output to the nearest sentence (or sentences) produced by a grammar.
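To make the task concrete, here is a minimal baseline sketch in Python. It is not a proposed solution: it assumes the grammar is small enough that its sentences can be enumerated into a list, and it measures nearness with character-level Levenshtein distance on the spelling rather than on phonetics. The names `edit_distance`, `nearest_sentences` and `GRAMMAR_SENTENCES`, as well as the example sentences, are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def nearest_sentences(asr_output: str,
                      grammar_sentences: Iterable[str],
                      n: int = 1) -> List[Tuple[int, str]]:
    """Return the n in-grammar sentences closest to the ASR output.

    Each result is a (distance, sentence) pair, closest first.
    """
    query = asr_output.lower().strip()
    scored = [(edit_distance(query, s.lower()), s) for s in grammar_sentences]
    scored.sort(key=lambda pair: pair[0])  # stable: ties keep input order
    return scored[:n]


# Hypothetical sentences standing in for the output of a real grammar.
GRAMMAR_SENTENCES = ["call john", "call joan", "play music", "what time is it"]

if __name__ == "__main__":
    # A made-up misrecognition of "call john".
    for distance, sentence in nearest_sentences("cold john", GRAMMAR_SENTENCES, n=2):
        print(distance, sentence)
```

A phonetically motivated variant could apply the same distance to phoneme sequences instead of letters, which would require a pronunciation lexicon or grapheme-to-phoneme step that is not shown here. Enumerating all sentences also stops being feasible for larger grammars, which is one reason the project asks for better methods.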
As a resource, the student will have a Wizard-of-Oz corpus collected in the EU FP7 project Alfred, containing ASR output and transcribed speech (to be mapped to the nearest in-grammar sentence).
Some ideas towards possible solutions (there may well be other, better ideas!):
Python, GF, machine translation/ILP/other method.