Mapping open-domain ASR output to dialogue systems grammars


Build a dialogue system module that, given a grammar defining the strings the system can understand, takes input from an open online automatic speech recognition (ASR) service (e.g. Google) and maps it onto the (phonetically) nearest string generated by the grammar.

This thesis project will be carried out within Talkamatic's EU FP7 Alfred project.

Problem description

Open ASR is available over the internet, but its results are hard to use in dialogue systems with limited language understanding capabilities. ASR output often contains errors because the recognizer does not know the vocabulary of the domain the system handles. The task of this project is to find innovative and practically useful ways of mapping ASR output to the nearest sentence (or sentences) generated by a grammar.

As a resource, the student will have a Wizard-of-Oz corpus collected in the EU FP7 project Alfred, containing ASR output and transcribed speech (to be mapped to the nearest in-grammar sentence).

Some ideas towards possible solutions (there may well be other, better ideas!):

  • See it as a machine translation problem?
  • Store memory of human corrections, cached as FSTs for quick application?
  • Apply text simplification algorithms using Integer Linear Programming (ILP)?
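
As a point of comparison for any of the ideas above, a naive baseline might simply rank the grammar's sentences by edit distance to the ASR output. The sketch below is only an illustration under strong assumptions: the grammar is taken to generate a small, finite set of sentences that can be enumerated as plain strings, and character-level Levenshtein distance stands in for true phonetic distance. The names `edit_distance`, `nearest_in_grammar` and `GRAMMAR_SENTENCES`, as well as the example sentences, are hypothetical and not part of the Alfred project.

```python
from typing import List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def nearest_in_grammar(asr_output: str,
                       grammar_sentences: Sequence[str],
                       n: int = 1) -> List[str]:
    """Return the n grammar sentences closest to the ASR output.

    Assumption: character-level distance is a crude proxy for phonetic
    similarity; a real system would compare phoneme sequences instead.
    """
    normalized = asr_output.lower().strip()
    ranked = sorted(grammar_sentences,
                    key=lambda s: edit_distance(normalized, s.lower()))
    return ranked[:n]


# Hypothetical toy "grammar": an explicit list of in-grammar sentences.
GRAMMAR_SENTENCES = [
    "call john",
    "call mary",
    "play some music",
    "what is the weather today",
]

if __name__ == "__main__":
    # A misrecognized utterance is mapped to its closest in-grammar sentence.
    print(nearest_in_grammar("cold john", GRAMMAR_SENTENCES))
```

Enumerating every sentence only works for very small grammars, which is one reason the more scalable approaches listed above may be worth exploring.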

Recommended skills

Python, GF (Grammatical Framework), and machine translation, ILP, or another suitable method.


Supervisors

  • Staffan Larsson, Christos Koniaris (FLoV, GU)
  • External supervisor from Talkamatic AB.

Page updated: 2016-11-15 11:08
