  CLT seminar: Thomas François – Dmesure and FLELex: two approaches to textual complexity for French as a foreign language

Readability research aims to provide reproducible, automatic methods for assessing the difficulty of texts for a given population. Such methods are based on various linguistic characteristics of the texts being assessed. They have mostly been developed for English (Flesch, 1948; Dale and Chall, 1948; Heilman et al., 2007), whereas very little work has been carried out for French (Henry, 1975; François and Fairon, 2012). In this talk, we first summarize a set of experiments conducted on the readability of French as a foreign language (FFL). They led to the design of a readability model for FFL that predicts the difficulty level of texts according to the Common European Framework of Reference for Languages (CEFR). To achieve this goal, the model relies on techniques from natural language processing (to extract the linguistic features) and from machine learning (to combine these features within a statistical model).
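As a rough illustration of the feature-extraction step only, the sketch below computes two classic surface features, mean sentence length and mean word length. The function name `surface_features`, the regular expressions and the sample sentence are assumptions made for this example; they are not the features or tools used in the model presented in the talk, and the machine-learning step is not shown.

```python
import re


def surface_features(text):
    """Return mean sentence length (in words) and mean word length (in characters)."""
    # Naive sentence split on terminal punctuation; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Naive tokenization: runs of word characters (accented letters included).
    words = re.findall(r"\w+", text)
    if not sentences or not words:
        return {"words_per_sentence": 0.0, "chars_per_word": 0.0}
    return {
        "words_per_sentence": len(words) / len(sentences),
        "chars_per_word": sum(len(w) for w in words) / len(words),
    }


# Hypothetical sample text, for illustration only.
features = surface_features("Le chat dort. Il fait beau.")
```

In a full system, feature vectors of this kind would be computed for texts with known CEFR levels and passed to a statistical learner.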

In the second part of the talk, we will focus on a specific issue, namely the difficulty of the lexicon for learners of FFL. The lexicon is acknowledged to be an essential linguistic component of adequate reading comprehension. In L2 education, the progression of vocabulary teaching is generally guided by vocabulary lists, such as Gougenheim (1958). These lists rely heavily on the frequency of words in a large corpus of L1 texts. Their use for L2 applications is therefore questionable. With the advent of the CEFR, this issue could be alleviated thanks to the reference supplements intended to mark out the lexicon acquisition process. However, these references lack precision about word uses, and their design is also open to question. This has led to various attempts to evaluate their validity (e.g. KELLY, VALILEX). We will introduce an alternative to simple frequency lists: FLELex, a freely available resource for FFL that describes the frequency distribution of words across the six levels of the CEFR. The methodology and corpus used to estimate the frequencies will be detailed and illustrated through a website that allows FLELex to be queried online.
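To make the idea of a per-level frequency distribution concrete, here is a minimal sketch that computes the relative frequency of each word at each of the six CEFR levels. The function `level_frequencies`, the toy corpus and the plain relative-frequency estimate are assumptions for illustration; the actual FLELex corpus and estimation methodology are the subject of the talk and are not reproduced here.

```python
from collections import Counter

CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def level_frequencies(corpus):
    """Map each word to its relative frequency at each CEFR level.

    corpus: dict mapping a CEFR level to a list of tokens from texts of that level.
    Returns: dict mapping word -> list of six relative frequencies (A1..C2).
    """
    counts = {lvl: Counter(corpus.get(lvl, [])) for lvl in CEFR_LEVELS}
    totals = {lvl: sum(counts[lvl].values()) for lvl in CEFR_LEVELS}
    vocab = set().union(*(c.keys() for c in counts.values()))
    return {
        word: [
            counts[lvl][word] / totals[lvl] if totals[lvl] else 0.0
            for lvl in CEFR_LEVELS
        ]
        for word in vocab
    }


# Invented toy data: tokens from hypothetical A1 and B1 texts.
toy_corpus = {
    "A1": ["chat", "maison", "chat", "manger"],
    "B1": ["maison", "néanmoins", "manger", "maison"],
}
table = level_frequencies(toy_corpus)
```

Unlike a single L1 frequency count, such a table shows at which level a word first appears in learner-oriented material and how its use evolves across levels.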

Keywords: lexicon difficulty, readability of FFL, CALL, CEFR

Thomas François,
CENTAL (Université catholique de Louvain)

Date: 2015-02-05 10:30 - 12:00

Location: L308, Lennart Torstenssonsgatan 8


