During the first 12 months, the technical and scientific activities of the iRead4Skills project were carried out mainly within the work packages (WPs) described below.
- WP2 "Skills surveys: needs, skills, and gaps" assessed low-literacy adult learners and trainers in adult learning (AL) and vocational education and training (VET) centres, focusing on training activities, motivation, reading skills, habits, and preferences. Surveys included:
i) Reading Skills survey for adults and trainers in Portugal, Belgium, France, and Spain.
ii) An overall skills and gaps survey that examined how reading difficulties affect access to employment and well-being, covering personal confidence, the acquisition of other skills, and impacts on various aspects of life. The survey results informed a final literature review and a report on the impact of literacy on skills and working life.
- WP3 "Complexity classification and data" identified the relevant complexity levels for low-literacy adult learners in order to guide data collection for the analysis system. Tasks included:
i) Defining complexity levels based on proficiency descriptors for Spanish, French, and Portuguese texts.
ii) Compiling and annotating text datasets for French, Portuguese, and Spanish, with ongoing enhancement and validation by end-users.
iii) Defining lexicons per complexity level for each language, using available resources and expertise.
The completion of these stages corresponds to the first milestones of the project: Milestone 1 (Skills and needs data) and Milestone 2 (Data sets), the last result of which will be fully accomplished in month 15.
- WP4 "Intelligent complexity analysis" activities included:
i) Defining features for complexity analysis and implementing readability predictors.
ii) Investigating different approaches for encoding text information and developing Python APIs for each target language to extract features automatically. The APIs use NLP tools tailored to each language and include additional lexico-semantic features derived from word embeddings. The intelligent complexity analysers provide the basis for computing larger sets of readability variables and for annotating linguistic phenomena in the text that are related to reading difficulties.
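
To make the idea of automatic feature extraction concrete, the following is a minimal sketch of how a few surface readability features might be computed from a text. It is an illustration only and not the project's API: the function names, the feature set, and the toy lexicon with its level scheme are all assumptions, and the language-specific NLP tools and embedding-based lexico-semantic features mentioned above are left out.

```python
import re
from statistics import mean

# Hypothetical toy lexicon mapping words to a complexity level (1 = easiest).
# The project's real lexicons and level definitions are not reproduced here.
TOY_LEXICON = {
    "the": 1, "cat": 1, "sat": 1, "on": 1, "mat": 1, "is": 1,
    "nevertheless": 3,
}


def tokenize(text):
    """Lowercase the text and return runs of letters (accented letters included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def split_sentences(text):
    """Naive sentence split on '.', '!' and '?'; real tools handle far more cases."""
    return [part.strip() for part in re.split(r"[.!?]+", text) if part.strip()]


def extract_features(text, lexicon=TOY_LEXICON, max_level=1):
    """Return a dict of simple surface features for one text.

    `max_level` is an assumed threshold: words at or below it count as "easy";
    words missing from the lexicon are treated as not easy.
    """
    tokens = tokenize(text)
    if not tokens:
        return {
            "n_sentences": 0, "n_tokens": 0, "avg_sentence_length": 0.0,
            "avg_word_length": 0.0, "type_token_ratio": 0.0,
            "easy_lexicon_coverage": 0.0,
        }
    sentences = split_sentences(text)
    easy = sum(1 for t in tokens if lexicon.get(t, max_level + 1) <= max_level)
    return {
        "n_sentences": len(sentences),
        "n_tokens": len(tokens),
        "avg_sentence_length": len(tokens) / len(sentences),
        "avg_word_length": mean(len(t) for t in tokens),
        "type_token_ratio": len(set(tokens)) / len(tokens),
        "easy_lexicon_coverage": easy / len(tokens),
    }


if __name__ == "__main__":
    for name, value in extract_features("The cat sat on the mat. The cat is happy.").items():
        print(f"{name}: {value}")
```

Features of this kind could serve as inputs to a readability predictor; a per-language version would typically replace the regular-expression tokenizer and sentence splitter with proper language-specific tools.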