Periodic Reporting for period 4 - STATLEARN (The reading brain as a statistical learning machine)
Reporting period: 2021-03-01 to 2022-02-28
STATLEARN tested this conjecture by combining techniques from Computational Linguistics, Experimental Psychology and Neuroscience. We carried out experiments with adults and children, using both behavioural methods and state-of-the-art technologies such as EEG, eye tracking and MEG. We tested natural reading as well as the simulated learning of new writing systems in the lab. Overall, this rich set of experiments showed that reading and word learning do build on sensitivity to letter statistics, and do so from a very young age. This sensitivity seems to rest on a mechanism that is not specific to language and that is also at play, at least in a rudimentary form, in non-linguistic animals.
Given that reading is one of the most widespread human activities and is critical for navigating modern society, this project promises to have far-reaching impact. By providing new insight into how we acquire literacy, STATLEARN may inform how we diagnose and treat developmental and acquired dyslexia, and how written language is taught in schools. More generally, the project may shed new light on the incredible learning and information-processing abilities of the human brain.
Words that share part of their internal structure (e.g. [mind]ful and [mind]less) are connected in the human brain and cognitive system. We carried out a series of chronometric experiments to test whether this is related to letter co-occurrence statistics: on this account, we would notice the presence of “mind” in “mindful” and “mindless” because the letters m, i, n and d often occur together in the language. The data gathered so far suggest that this is not the case.
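As an illustration of what "letter co-occurrence statistics" can mean in practice, the sketch below counts adjacent letter pairs (bigrams) in a word list and scores a letter chunk by the average count of its bigrams. It is a minimal sketch, not the measure used in the project: the toy lexicon, the function names and the choice of raw bigram counts are all assumptions made for this example.

```python
from collections import Counter


def bigram_counts(words):
    """Count adjacent letter pairs (bigrams) across a list of words."""
    counts = Counter()
    for word in words:
        w = word.lower()
        counts.update(w[i:i + 2] for i in range(len(w) - 1))
    return counts


def mean_bigram_count(chunk, counts):
    """Average lexicon count of the bigrams that make up `chunk`."""
    pairs = [chunk[i:i + 2] for i in range(len(chunk) - 1)]
    if not pairs:
        return 0.0
    return sum(counts[p] for p in pairs) / len(pairs)


# Toy lexicon, invented purely for illustration (not project data).
toy_lexicon = ["mindful", "mindless", "kind", "find", "remind", "driver", "baker"]
counts = bigram_counts(toy_lexicon)

# "mind" is a recurring chunk; "ndfu" straddles the boundary in "mindful".
mind_score = mean_bigram_count("mind", counts)
ndfu_score = mean_bigram_count("ndfu", counts)
print(mind_score, ndfu_score)
```

In this toy lexicon the chunk "mind" scores 13/3 and the boundary-straddling string "ndfu" scores 7/3, so a purely statistical account would predict that "mind" stands out as a unit. A real analysis would estimate these statistics from a large corpus and would probably weight them by word frequency.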
In a second series of experiments, we asked participants to learn a set of novel words in the lab, and tested whether they relied on letter co-occurrence statistics in doing so. When the experiments involved familiar letters, participants tended to apply the statistics of their native language rather than learn new regularities based on the novel words. When we used an unfamiliar alphabet instead, participants did seem to capture the co-occurrence patterns among the novel characters.
We also looked for brain signatures of sensitivity to recurring chunks of letters. We did so by embedding these chunks periodically in a stream of visual events, and assessing whether the brain synchronises its rhythm to this same periodicity. The data suggest this to be the case, at least in areas typically devoted to higher-level vision (i.e. the left occipito-temporal cortex). We also obtained evidence that this sensitivity is enhanced by meaning: the brain responds more strongly to recurring clusters of letters that also carry a consistent meaning (e.g. “ness” in “kindness”, “fairness” and “bitterness”, or “er” in “driver”, “dealer” and “baker”).
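The logic of this periodic-stimulation approach can be sketched with simulated data: if the brain tracks the recurring chunk, the recorded signal should carry extra energy at exactly the rate at which the chunk appears. The example below is a minimal sketch under stated assumptions, not the project's analysis pipeline. The stimulation rates (6 Hz for the stream, 1.2 Hz for the chunk), the noise level, the signal-to-noise measure and all function names are hypothetical choices made for illustration.

```python
import math
import random


def amplitude_at(signal, freq_hz, sample_rate_hz):
    """Amplitude of the Fourier component of `signal` at `freq_hz`."""
    n = len(signal)
    step = 2 * math.pi * freq_hz / sample_rate_hz
    re = sum(x * math.cos(step * i) for i, x in enumerate(signal))
    im = sum(x * math.sin(step * i) for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n


def tagging_snr(signal, target_hz, sample_rate_hz, n_side=4):
    """Amplitude at the tagged frequency relative to nearby frequency bins."""
    resolution = sample_rate_hz / len(signal)  # width of one frequency bin
    # Skip the bins immediately next to the target, then take n_side per side.
    offsets = [k for k in range(-(n_side + 1), n_side + 2) if abs(k) >= 2]
    noise = [
        amplitude_at(signal, target_hz + k * resolution, sample_rate_hz)
        for k in offsets
    ]
    return amplitude_at(signal, target_hz, sample_rate_hz) / (sum(noise) / len(noise))


def simulate(chunk_amplitude, base_hz=6.0, chunk_hz=1.2,
             sample_rate_hz=100.0, seconds=20.0, seed=0):
    """Simulated recording: stream response + chunk response + Gaussian noise."""
    rng = random.Random(seed)
    n = int(sample_rate_hz * seconds)
    return [
        math.sin(2 * math.pi * base_hz * i / sample_rate_hz)
        + chunk_amplitude * math.sin(2 * math.pi * chunk_hz * i / sample_rate_hz)
        + rng.gauss(0.0, 0.5)
        for i in range(n)
    ]


# Hypothetical "sensitive" and "insensitive" recordings, tested at the chunk rate.
snr_tagged = tagging_snr(simulate(chunk_amplitude=0.5), 1.2, 100.0)
snr_control = tagging_snr(simulate(chunk_amplitude=0.0), 1.2, 100.0)
print(snr_tagged, snr_control)
```

A signal-to-noise ratio well above 1 at the chunk frequency, with no comparable peak in the control signal, is the kind of pattern that would be read as neural synchronisation to the recurring chunk. Real EEG or MEG data would additionally require preprocessing, sensor or source selection, and statistical testing across participants.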
We complemented this evidence from adults by looking into when sensitivity to letter statistics emerges in children learning to read. Eye tracking data on text reading suggest that, as early as Grade 3, children are sensitive to the frequency with which given letter combinations occur in the language, and that this information guides their visual exploration of the written text. We also investigated whether this sensitivity shows up in brain signatures, and particularly in the capacity of the brain to entrain to external stimuli.
This set of data has generated a large number of conference presentations and papers, which are reported in the Publication and Dissemination sections. Moreover, several papers have been submitted or are in preparation, as detailed below:
1. Lelonkiewicz et al., Morphemes as letter chunks: Linguistic information enhances the learning of visual regularities. [submitted]
2. Pescuma et al., Automatic Morpheme Identification Across Development: Magnetoencephalography (MEG) Evidence from Fast Periodic Visual Stimulation. [submitted]
3. Hasenacker et al., Prediction at the intersection of sentence context and word form: Evidence from eye-movements and self-paced reading. [submitted]
4. De Rosa et al., Co-occurrence statistics affect letter processing. [in preparation]
5. De Rosa et al., Selective Neural Entrainment Reveals Hierarchical Tuning to Linguistic Regularities. [in preparation]
6. Pescuma et al., Eye movements during natural reading reveal sensitivity to orthographic regularities in children. [in preparation]
7. Pescuma et al., EyeReadIt: A developmental eye-tracking corpus of text reading in Italian
8. Lelonkiewicz et al., Spontaneous human-like string processing in rats. [in preparation]
9. Lelonkiewicz et al., Lexical diversity and word learning based on letter statistics. [in preparation]
10. Ktori et al., Affix-like chunks determine lexical plausibility in children. [in preparation]
11. Ktori et al., Morpheme position coding in compounds. [in preparation]
12. Franzon and Crepaldi, Feature coding in morphological contrasts. [in preparation]
In addition, two conference presentations are scheduled:
1. European Society for Cognitive Psychology (ESCoP), August 2022 (by Maria Ktori)
2. Interdisciplinary Advances in Statistical Learning, June 2022 (by Davide Crepaldi)