In the first study, we asked how flexible predictions are. When we hear a word that mismatches our lexical and semantic predictions, can our brains quickly reroute those predictions in a different direction? And does this redirection also involve suppressing the previous, no longer viable predictions? To answer these questions, we ran an EEG study in which participants read short sentences presented on a screen, for example: “He always hid an extra set of keys under a …”. From additional behavioral tests, we learned that people expect such sentences to continue with “mat” or, to a lesser extent, with “rug”. We wanted to know whether preceding these nouns with an adjective, such as “rubber” or “Persian”, would quickly update the degree to which participants expect the nouns. In particular, we asked whether an adjective promoting the noun (“rubber mat”, “Persian rug”) would strengthen participants’ predictions of that noun and, more crucially, whether an adjective promoting the other noun (for example, “rubber rug” or “Persian mat”) would suppress them. We focused on the brain’s response to the noun, specifically on the amplitude of the N400 component, an index of the degree to which the meaning of a given word has already been activated by the preceding part of the sentence. We found that the adjectives indeed modulated the N400 amplitude to the noun. This showed that predictions can be flexibly redirected on very short time scales, on a word-by-word basis.
In the second study, we asked a more fundamental question: Do these predictions occur at the semantic or the lexical level? Is the brain helped by any degree of overlap in meaning between the semantic context and the actual word, even if the word itself is completely unpredictable? For example, both “dog” and “tree” are improbable continuations of the sentence “He invited a famous ...”, but “dog” may still fit the context better because it is animate and thus more likely to be invited than “tree”. If predictions are formulated at the semantic level, then the processing of the two words will differ, because “dog” shares more semantic features with the context than “tree” does. If predictions are formulated at the lexical level, then both words should lead to similar processing difficulty, because both are improbable continuations of the sentence. We addressed these questions using GPT-2, a state-of-the-art computer model of English (similar to, for example, the models Google uses to interpret the search queries typed in by users), which is able to “understand” a sentence and estimate the probability of any word at any position in it in a way that is sensitive to overlap in semantic features. For example, in the sentence above, the model clearly estimates that “dog” is far more probable than “tree”, even though both words have a very small probability. In all experiments, participants read short sentences while their EEG was recorded. The final word of each sentence varied in its predictability, and many of the sentence endings were unpredictable (although their probability, as estimated by the language model, still varied).
Overall, our analyses of the N400 amplitude to the sentence endings showed that predictions are formulated at the level of semantic features (see attached figure: brain waves elicited by the final word of the tested sentences, depending on its probability as estimated by the experiment participants and by the language model; the differences between the waveforms occur in the N400 component).
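To make the idea of a graded, model-estimated word probability more concrete, here is a minimal sketch of how such probabilities are commonly turned into a single "surprisal" score per word. It is not the analysis code used in the study. It assumes only that the language model returns a probability for each piece of a word (GPT-2 splits some words into several sub-word pieces); the function names and the probability values below are hypothetical placeholders.

```python
import math


def word_log_prob(piece_log_probs):
    """Log-probability of a whole word from the log-probabilities of its pieces.

    By the chain rule, P(word | context) is the product of the conditional
    probabilities of its sub-word pieces, so their log-probabilities add up.
    """
    return sum(piece_log_probs)


def surprisal_bits(log_prob):
    """Surprisal, -log2 P, of a word given its natural-log probability."""
    return -log_prob / math.log(2)


# Placeholder probabilities for two unlikely continuations of the same
# sentence; illustrative only, not values from the study or from GPT-2.
candidates = {
    "dog": [math.log(1e-3)],                   # word coded as one piece
    "tree": [math.log(1e-2), math.log(1e-3)],  # word coded as two pieces
}

for word, pieces in candidates.items():
    bits = surprisal_bits(word_log_prob(pieces))
    print(f"{word}: {bits:.2f} bits")
```

On this scale, two words can both be very improbable and still differ in how surprising they are, which is the kind of graded difference among unpredictable sentence endings that the second study relied on.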