Classification of learner essays by achieved proficiency level

Goal

Develop an algorithm, exposed as web service(s), for automatically classifying Swedish learner essays by the proficiency level they have reached.


Background

The suggested approach is to use machine learning for essay classification. The challenge is to identify features that are both informed by Second Language Acquisition (SLA) research and informative for the task at hand.

The classification will be made in terms of the proficiency levels of the Common European Framework of Reference (CEFR), which covers six learner levels: A1 (beginner), A2, B1, B2, C1, C2 (near-native). At the moment we have electronic corpora of essays at levels B1, B2, and C1. Essays at A2 are hand-written and have not yet been digitized and annotated (which presumably can be done in time for the project, if someone picks this topic).
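
To make the idea of feature-based level classification concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration, not part of the project: the three surface features (mean sentence length, mean word length, type-token ratio), the nearest-centroid classifier, and the two toy training sentences, which are invented and carry arbitrary labels rather than coming from the corpora mentioned above. A real solution would use SLA-informed features and a properly evaluated learning algorithm.

```python
import math
import re

def extract_features(text):
    """Return [mean sentence length, mean word length, type-token ratio]."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text.lower())
    if not sentences or not words:
        return [0.0, 0.0, 0.0]
    return [
        len(words) / len(sentences),              # words per sentence
        sum(len(w) for w in words) / len(words),  # characters per word
        len(set(words)) / len(words),             # lexical variation
    ]

def train_centroids(labelled_essays):
    """Average the feature vectors of the essays at each level."""
    grouped = {}
    for text, level in labelled_essays:
        grouped.setdefault(level, []).append(extract_features(text))
    return {
        level: [sum(column) / len(column) for column in zip(*vectors)]
        for level, vectors in grouped.items()
    }

def classify(text, centroids):
    """Return the level whose centroid is nearest in Euclidean distance."""
    features = extract_features(text)
    return min(centroids, key=lambda level: math.dist(features, centroids[level]))

# Invented toy data with arbitrary labels (NOT corpus data).
toy_training = [
    ("Jag bor i Göteborg. Jag har en katt.", "B1"),
    ("Eftersom vädret var dåligt bestämde vi oss för att stanna hemma och läsa.", "C1"),
]
centroids = train_centroids(toy_training)
print(classify("Jag har en hund. Jag bor i Lund.", centroids))  # B1
```

The features are unscaled here, so sentence length dominates the distance; with real data one would normalize the features before comparing them.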


Problem description

The steps for this project would include:

  • background reading on SLA, CEFR, essay grading and learner essay classification by levels. See one example of Swedish essay grading (NOT in terms of levels, but in terms of grades, i.e. (Väl/Icke) Godkänd): http://www.ling.su.se/english/nlp/tools/automated-essay-scoring
  • testing different approaches to find the best-performing classifier
  • implementation of web service(s) for learner essay classification
  • (potentially) implementation of a Lärka-based user interface where new essays can be tested
  • (potentially) evaluation of the results with teachers and new essays
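
As a rough illustration of the web-service step above, the following Python sketch wraps a classifier in a small JSON-over-HTTP endpoint using only the standard library. The request format (a JSON object with a "text" field), the response format (a "level" field), the port, and all function names are hypothetical choices made for this sketch, and `dummy_classifier` is a stand-in that would be replaced by a trained model.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def handle_request(body, classifier):
    """Turn a raw JSON request body into (HTTP status, response dict)."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, {"error": "body must be valid JSON"}
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str) or not text.strip():
        return 400, {"error": "missing non-empty 'text' field"}
    return 200, {"level": classifier(text)}

def make_handler(classifier):
    """Build a request handler class bound to the given classifier."""
    class EssayHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            status, response = handle_request(self.rfile.read(length), classifier)
            data = json.dumps(response).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json; charset=utf-8")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)
    return EssayHandler

def dummy_classifier(text):
    # Stand-in only: always answers "B1". A trained model would replace it.
    return "B1"

def serve(port=8000, classifier=dummy_classifier):
    """Start the service on localhost; blocks until interrupted."""
    HTTPServer(("localhost", port), make_handler(classifier)).serve_forever()
```

Keeping the request logic in `handle_request`, separate from the HTTP handler, lets it be tested without starting a server, and a user interface could later POST essay text to the same endpoint.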


Recommended skills:

  • Python
  • jQuery
  • interest in machine learning


Supervisor(s)

  • Elena Volodina/Ildiko Pilan
  • potentially others from Språkbanken/FLOV

Page updated: 2016-11-15 11:07