SEMINAR
In this project, a large-vocabulary continuous speech recognition system was built from freely available Swedish speech data. One acoustic model and several bigram and trigram language models were trained with the open-source software packages HTK and the CMU Statistical Language Modeling Toolkit. The system was then evaluated with these models, using various HTK tools, to determine what results can be achieved with the given data and how language model size affects recognition performance.
The lowest word error rate achieved was 47.45% and the lowest sentence error rate was 87.45%. The recognition results showed that increasing language model complexity, in terms of both n-gram order and vocabulary size, lowers the error rates. The error rates were rather high compared to those reported in similar projects, but the results can easily be improved by building larger n-gram models and using a decoder that is better suited to continuous speech recognition.
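For readers unfamiliar with the two metrics, the following is a minimal sketch of how word error rate and sentence error rate are commonly computed. It assumes the usual definitions (word-level edit distance divided by the number of reference words, and the fraction of sentences containing at least one error); it is not the scoring procedure used in the project, and the function names and example sentences are hypothetical.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rates(pairs):
    """Return (word error rate, sentence error rate) for (reference, hypothesis) pairs."""
    total_errors = 0
    total_ref_words = 0
    wrong_sentences = 0
    for reference, hypothesis in pairs:
        ref_words = reference.split()
        hyp_words = hypothesis.split()
        errors = edit_distance(ref_words, hyp_words)
        total_errors += errors
        total_ref_words += len(ref_words)
        if errors > 0:
            wrong_sentences += 1
    return total_errors / total_ref_words, wrong_sentences / len(pairs)


# Hypothetical example: one substituted word in the first sentence.
pairs = [("a b c d", "a x c d"), ("e f", "e f")]
wer, ser = error_rates(pairs)
```

Under these definitions a single misrecognized word marks the whole sentence as wrong, which is why a sentence error rate is typically much higher than the corresponding word error rate.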
Examiner: Richard Johansson
Opponent: Joel Hinz
Supervisor: Chris Koniaris
Page updated: 2015-05-23 10:06