What CEFR Solutions Can Publishers Use?


Converting content into smart content is crucial for publishers that want to survive in the age of digital media and artificial intelligence. Educational publishers hold a large volume of content that needs to be stored accurately and labelled with the right language or reading level. Following the CEFR* scale is therefore key to matching content to the right readability level. In this blog post, we'll examine which solutions on the market offer CEFR classification.

Artificial intelligence can classify content automatically, sidestepping the need for manual tagging by experts. The algorithms assign each piece of content an appropriate language level based on an objective standard.
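To make the idea concrete, here is a minimal Python sketch of what assigning a language level to a piece of content could look like. It is only an illustration: the scoring function is a crude stand-in for a trained model, the cut-off values are invented for this example, and the names `toy_difficulty_score`, `TOY_THRESHOLDS` and `tag_text` are hypothetical. It does not represent how any of the vendors below actually classify text.

```python
import re
from dataclasses import dataclass
from enum import IntEnum


class CEFRLevel(IntEnum):
    """The six CEFR levels, ordered from easiest to hardest."""
    A1 = 1
    A2 = 2
    B1 = 3
    B2 = 4
    C1 = 5
    C2 = 6


@dataclass
class TaggedText:
    """A piece of content together with its assigned level."""
    text: str
    level: CEFRLevel


def toy_difficulty_score(text: str) -> float:
    """Crude stand-in for a trained model (assumption, not a validated
    readability measure): average words per sentence plus average
    letters per word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    avg_sentence_len = len(words) / len(sentences)
    avg_word_len = sum(len(w) for w in words) / len(words)
    return avg_sentence_len + avg_word_len


# Hypothetical cut-offs, chosen only for illustration.
TOY_THRESHOLDS = [
    (8.0, CEFRLevel.A1),
    (11.0, CEFRLevel.A2),
    (14.0, CEFRLevel.B1),
    (18.0, CEFRLevel.B2),
    (23.0, CEFRLevel.C1),
]


def tag_text(text: str) -> TaggedText:
    """Assign the first level whose upper bound exceeds the score;
    anything above the last bound is tagged C2."""
    score = toy_difficulty_score(text)
    for upper_bound, level in TOY_THRESHOLDS:
        if score < upper_bound:
            return TaggedText(text, level)
    return TaggedText(text, CEFRLevel.C2)
```

Because the levels are stored as an ordered type, a publisher's catalogue could then be filtered with ordinary comparisons, for example keeping only items where `item.level <= CEFRLevel.B1`. A production system would replace the toy score with a model trained on expert-labelled texts.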

Several companies offer technology that can tag content automatically, but most of them classify it by standards other than the CEFR scale.

1. Wizenoze (Netherlands): Wizenoze uses machine learning to estimate the readability of educational texts. It classifies texts according to its own index, the Wizenoze Readability Index, which scores texts from 1 to 5, with each level roughly comparable to UK and US educational stages. The tool has several additional functions, such as finding simpler alternatives for difficult terms. However, Wizenoze doesn’t provide CEFR classification.

2. Aylien (Ireland): Aylien uses Natural Language Processing to extract insights from textual content. Through its APIs, Aylien analyzes texts for sentiment, categorizes them according to IAB-QAG and IPTC News Codes, and extracts metadata from them. Aylien also has several helpful features, such as hashtag suggestions and tools to remove clutter from web pages. The tool can also detect which language a text is written in and assign a confidence score to that detection, based on its own criteria. However, Aylien doesn’t assign a reading difficulty level to a text, and it doesn’t provide CEFR classification.

3. UNSILO (Denmark): UNSILO brings publishing together with Artificial Intelligence. It focuses primarily on evaluating manuscripts and building content packages for new business opportunities, and is not designed specifically for educational publishers. However, it can provide custom metadata tagging solutions through APIs focused on key concepts and related content. UNSILO does not assign reading difficulty levels to content.

4. IBM Watson (United States): IBM’s Watson is the grandfather of Natural Language Processing for content. It can extract entities, relationships, keywords and semantic roles from unstructured data. It currently operates in 13 different languages, but it doesn’t provide an assessment of language difficulty. It isn’t designed specifically for educational publishers, or even for publishers in general, but no list of metadata-tagging tools would be complete without it.

5. EDIA (Netherlands): EDIA uses Artificial Intelligence and machine learning to automatically tag and classify content for educational publishers. EDIA’s primary solution classifies texts by language level using the Common European Framework of Reference for Languages (CEFR). This standard is used broadly across Europe and places content at one of six levels of difficulty.

It’s clear that publishers have many ways to automatically tag content with metadata for different language levels. Different classification standards are used across the industry, which can make it difficult for publishers to maintain a standardized system. However, the range of options means educational publishers can choose the right tool for their environment and market.

*The Common European Framework of Reference for Languages (CEFR) is an international language learning standard set by experts from the Council of Europe. The CEFR scale has six levels, ranging from A1 (beginner) to C2 (proficient).


About EDIA

EDIA, an education technology company, was founded in 2004 and is based in Amsterdam, the Netherlands. In 2006, EDIA launched its first AI product for education, which used machine learning and natural language processing to curate online text sources for vocabulary training. The product won several international awards and is still widely used today. In recent years, EDIA has transformed into a SaaS platform that applies Artificial Intelligence technology to analyse text.

To learn more, schedule an appointment with the EDIA sales team today.