This project involves working closely with industry to improve the accuracy of automatic translation of software interfaces and documentation by exploiting context specificity.
For instance, software source code can be mined for the context in which each message appears (e.g., to distinguish a button label from an error message). The student(s) will incorporate GF (Grammatical Framework, www.molto-project.eu) grammars into a hybrid translation system that improves on CA Labs' current statistical and Translation Memory-based methods. User interfaces will receive more accurate automatic translations, and error/feedback messages will no longer be generic but will be adapted to the user's specific interaction scenario. The first goals are to deliver high-quality translations for the most commonly used languages/dialects and to develop an infrastructure that quickly produces acceptable-quality results for new languages. Follow-on work will optimize the translation engine for performance, enabling fast, offline translation of very large corpora of documents/artifacts.
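As a rough illustration of the source-mining idea, the Python sketch below scans a code fragment for string literals passed directly to a call and tags each message with a context derived from the call name. It is only a sketch under stated assumptions: the call names (`JButton`, `setTitle`, `showError`), the context labels, and the sample fragment are hypothetical and are not taken from CA Labs' code or tooling. A real system would likely use a proper parser rather than a regular expression.

```python
import re

# Hypothetical table mapping call names to message contexts.
# These names are illustrative assumptions, not part of any real project.
CONTEXT_BY_CALL = {
    "JButton": "button",
    "setTitle": "title",
    "showError": "error_message",
}

# Matches an identifier followed by "(" and a double-quoted string literal,
# e.g. showError("...") and captures the call name and the literal's content.
CALL_WITH_STRING = re.compile(r'(\w+)\s*\(\s*"((?:[^"\\]|\\.)*)"')


def mine_contexts(source):
    """Return (message, context) pairs for string literals passed to calls."""
    results = []
    for call, message in CALL_WITH_STRING.findall(source):
        results.append((message, CONTEXT_BY_CALL.get(call, "unknown")))
    return results


# Hypothetical source fragment: the same text "Save" occurs in two contexts.
SAMPLE = '''
JButton save = new JButton("Save");
dialog.setTitle("Save");
showError("Could not save the file");
'''

if __name__ == "__main__":
    for message, context in mine_contexts(SAMPLE):
        print(f"{context}: {message}")
```

In this fragment the string "Save" occurs once as a button label and once as a dialog title, which is the kind of distinction a context-aware translation system could exploit, since the two uses may call for different translations in the target language.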
This project involves working closely with researchers and linguists/language experts at CA Labs and includes a collaboration with faculty and students at the Universitat Politècnica de Catalunya. There are very good opportunities for short research visits or longer internships at CA Labs.
S.A. McKee, A. Ranta (Chalmers/GU), V. Montés, P. Paladini (CA Labs Barcelona)