Language Technology for Languages of the Central African Republic

A small project to make use of data on languages of the Central African Republic.

Duration: Oct 2007-

Participant: Harald Hammarström, PhD Student

This is an ongoing project to retrieve (from diskettes):

  • A Sango corpus
  • A Mpiemo text collection and dictionary

Then the resources will be used in the following manner:

  • POS induction and morphology induction algorithms will be run on the Sango and Mpiemo data, respectively
  • The corpora will be converted to suitable formats, such as TEI XML, for use in ITG (a pedagogical tool for teaching grammar based on tagged corpora)
  • A website will be built with general information on the languages of the Central African Republic, and the resources will be made available there
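
As a rough illustration of the conversion step, the following minimal sketch wraps whitespace-tokenised sentences in a bare TEI XML skeleton using Python's standard library. The function name `corpus_to_tei`, the choice of `<s>` and `<w>` elements, and the placeholder header text are assumptions made for this example only; the markup that ITG actually requires is not specified here, and the sample tokens are dummies rather than real corpus data.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI P5 documents.
TEI_NS = "http://www.tei-c.org/ns/1.0"


def corpus_to_tei(title, sentences):
    """Wrap whitespace-tokenised sentences in a minimal TEI skeleton.

    Hypothetical helper: the element layout below is one plausible
    choice, not the project's confirmed format.
    """
    ET.register_namespace("", TEI_NS)

    def q(tag):
        # Qualify a tag name with the TEI namespace.
        return f"{{{TEI_NS}}}{tag}"

    root = ET.Element(q("TEI"))

    # A TEI header needs a title, publication and source statement.
    header = ET.SubElement(root, q("teiHeader"))
    file_desc = ET.SubElement(header, q("fileDesc"))
    title_stmt = ET.SubElement(file_desc, q("titleStmt"))
    ET.SubElement(title_stmt, q("title")).text = title
    pub_stmt = ET.SubElement(file_desc, q("publicationStmt"))
    ET.SubElement(pub_stmt, q("p")).text = "Placeholder publication note."
    source_desc = ET.SubElement(file_desc, q("sourceDesc"))
    ET.SubElement(source_desc, q("p")).text = "Placeholder source note."

    # One <s> per sentence, one <w> per whitespace-separated token.
    body = ET.SubElement(ET.SubElement(root, q("text")), q("body"))
    para = ET.SubElement(body, q("p"))
    for number, sentence in enumerate(sentences, start=1):
        s = ET.SubElement(para, q("s"), {"n": str(number)})
        for token in sentence.split():
            ET.SubElement(s, q("w")).text = token

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Dummy tokens stand in for real corpus text.
    print(corpus_to_tei("Sample corpus", ["tok1 tok2 tok3", "tok4 tok5"]))
```

Part-of-speech tags produced by an induction step could later be attached to each `<w>` element as an attribute, though which attribute ITG expects is left open here.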

Page updated: 2009-10-08 09:20
