The European Language Grid (ELG) selected EDIA's 'CEFR Labelling and Assessment Services'. The EU funds EDIA's efforts towards automated readability assessment in Dutch, German, Spanish, and French.

THE PROJECT IN A NUTSHELL

Our project aims to develop a set of data collection and annotation tools to facilitate the creation of data sets (corpora), which can be used to develop classification models. These models can automatically assess a text's reading difficulty against the Common European Framework of Reference (CEFR). The ability to accurately and consistently check the readability level of texts is crucial to authors and teachers. It will allow them to create and discover content that meets the needs of students with different backgrounds and skill levels.

EDIA already provides automated readability assessment technology for the CEFR (available as an API and as an authoring tool), which currently supports English. Through this project, additional languages will be supported (i.e., Dutch, German, French, and Spanish). As part of the project, we will also build an infrastructure that will pave the way for adding other languages in the future.

More details in our blogpost

 
ELG EDIA Model


 

"The selected project proposal outlines how EDIA will develop an application using language resources and technologies available in the ELG. We are glad to be able to fund this project, as it is not only technologically interesting but also has an important mission: fostering learning and education. The Pilot Board is convinced that this project will not only create outstanding project results in the form of tools and services but will also provide the ELG with valuable insights into the usability of our portal."

- Katrin Marheinecke, project manager, European Language Grid

WHY DOES THE CEFR MATTER?

The CEFR (Common European Framework of Reference for Languages: Learning, Teaching, Assessment) aims to provide a comprehensive learning, teaching, and assessment method that can be used for all European languages. Indicating the level of learners of foreign languages in Europe and beyond, the CEFR facilitates the assessment of a person's language proficiency. By now, most are familiar with the six reference levels (A1, A2, B1, B2, C1, and C2) used for this purpose.

 
Figure: The well-known CEFR scale
 

CEFR levels are the foundation for a communicative approach to (foreign) language acquisition, teaching, and certification. Although the CEFR levels represent a widely supported approach, the availability and quality of educational content labelled with CEFR levels are limited. That's because the highly laborious, error-prone labelling process is performed manually (save for some exceptions). This results in several practical obstacles regarding publishing, teaching, and learning:

  • Content creators (publishers, authors, and teachers) struggle to use consistent criteria for checking a text's difficulty level.

  • Schools and teachers have trouble finding and/or creating appropriate texts for their students.

To tackle this problem, we have developed an automated text classification technology using natural language processing. This technology can perform CEFR text levelling in a scalable, consistent manner for multiple languages at a very granular level.

By removing blockers through automation, we expect to impact the practical application of CEFR, enabling the labelling of more content in less time in a highly consistent manner. This way, we will lay the foundation for making educational resources with properly labelled text levels more widely available, adhering to the CEFR standard. After all, practical obstacles will have been eliminated.

ABOUT THE ELG

The ELG will strengthen the commercial European Language Technology landscape by establishing a pan-European marketplace. Offering powerful multilingual, cross-lingual, and monolingual technologies, the ELG will contribute to the emergence of a truly connected, language-crossing Multilingual Digital Single Market. It will create a digital marketplace where European companies can showcase and offer their language technologies to customers. The ELG will also provide technologies to European citizens, public administrations, and NGOs.

 
Figure: CEFR cost saving
 

For example, manually tagging 1,000 content items would cost € 40,000 and take more than 10 workdays (excluding the hours required to find grading experts). With the EDIA automated CEFR tagger, the same job would take approximately 10 minutes and cost € 500, a cost saving of more than 90% for your organization, along with a substantial gain in speed.
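For readers who want to check the numbers or plug in their own, the cost comparison above can be sketched in a few lines of Python. This is only an illustration: the function name `saving_percent` is our own, and the inputs are simply the two euro amounts quoted in the text. On those figures the saving works out above the 90% mark mentioned.

```python
def saving_percent(manual_cost: float, automated_cost: float) -> float:
    """Return the saving of the automated route as a percentage of the manual cost."""
    if manual_cost <= 0:
        raise ValueError("manual_cost must be positive")
    return (manual_cost - automated_cost) / manual_cost * 100


# Figures quoted in the text for 1,000 content items (in euros).
manual_cost = 40_000
automated_cost = 500

saving = saving_percent(manual_cost, automated_cost)
print(f"Cost saving: {saving:.2f}%")  # prints "Cost saving: 98.75%"
```

Replacing the two amounts with your own tagging budget gives a quick estimate for your situation; the time saving (workdays versus minutes) is not modelled here.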



