Uncovering the nature of human sentence processing: a computational/experimental approach

Final Report Summary - HUMSENTPROC (Uncovering the nature of human sentence processing: a computational / experimental approach)


How do people understand language? One fairly uncontroversial finding is that this cognitive process can be viewed as probabilistic: over the course of comprehension, people entertain multiple possible interpretations to varying degrees, which lead to probabilistic predictions about upcoming input. If the actual next input matches these predictions, processing is facilitated. Otherwise, it can be disrupted.

The most influential formalisation of this view is based on the concept of 'word surprisal', an information-theoretic measure of the extent to which a word occurs unexpectedly. Surprisal values can be computed from word probabilities given the sentence context. These probabilities, in turn, follow from a probabilistic model of the language.
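To make the definition concrete (this sketch is an illustration, not code from the project): surprisal is standardly computed as the negative logarithm of a word's probability given its context, so rarer continuations yield larger values. A minimal Python version, assuming the probability has already been obtained from some language model:

```python
import math

def surprisal(prob: float) -> float:
    """Surprisal in bits: the negative log2 probability of a word
    given its sentence context. Low probability -> high surprisal."""
    return -math.log2(prob)

# A word with probability 0.5 in context carries 1 bit of surprisal;
# a word with probability 0.01 is far more surprising.
print(surprisal(0.5))   # 1.0
print(surprisal(0.01))
```

The choice of log base only scales the measure; base 2 gives values in bits, which is conventional in information theory.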

It is well known that surprisal values account for word-reading times, which shows that surprisal is indeed a cognitively relevant measure of comprehension difficulty. However, because it is very difficult to estimate word probabilities accurately, the problem has traditionally been simplified: probabilities are estimated over words' syntactic categories (parts of speech) rather than over the words themselves.

Different probabilistic language models make different assumptions about the structures and statistics underlying sentence comprehension. For example, one model may assume that sentence comprehension involves the construction of hierarchical syntactic structures (as they are used by linguists to analyse sentences) whereas another model may not include any such structure and only look at the sequential order of words. These two models will estimate different word surprisal values, and by comparing their ability to predict reading times, we can identify the model that is closest to cognitive reality.
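The comparison described above amounts to asking which model's surprisal values explain more variance in observed reading times. As a hedged illustration of that logic (the data below are invented, and the project's actual analyses used proper mixed-effects regression, not this toy fit), one can compare the proportion of variance explained by a simple least-squares line for each model:

```python
def r_squared(xs, ys):
    """Proportion of variance in ys explained by a least-squares
    line on xs (the squared Pearson correlation)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)

# Hypothetical per-word reading times (ms) and surprisal estimates
# from two hypothetical models:
reading_times = [250, 310, 280, 400, 265]
model_a = [2.1, 4.0, 3.0, 6.5, 2.4]   # tracks reading times closely
model_b = [3.0, 3.1, 5.0, 4.0, 2.9]   # tracks them poorly

print(r_squared(model_a, reading_times))
print(r_squared(model_b, reading_times))
```

The model whose surprisal estimates yield the higher variance explained is taken to be the closer approximation of the comprehension process.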


The overarching objective of the HUMSENTPROC project is to increase our understanding of human sentence comprehension. This objective was originally subdivided into three research goals:

(1) To implement a range of probabilistic language models, with differing underlying assumptions. All are trained on the same, large corpus of English texts. Importantly, they estimate probabilities over words instead of syntactic categories. Consequently, they more realistically simulate human language processing and are less theory-dependent than earlier models.
(2) To collect empirical measures of the cognitive processing difficulty people experience when reading. The experimental stimuli are a collection of sentences that forms a random sample of English, and processing difficulty is measured by both reading times and electroencephalography (EEG).
(3) To identify the psychologically most plausible model by comparing models' surprisal estimates (over the experimental sentences) with collected word-reading times and different event-related potential (ERP) components of the EEG signal. The best model's assumptions regarding structures and statistics are the most accurate simulation of the cognitive process of sentence comprehension.

Over the course of the project, two further goals were added:

(4) To collect pupillometry data as an additional measure of cognitive processing difficulty during reading.
(5) To investigate whether 'entropy reduction' also has relevance as an information-theoretic measure of cognitive load.
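Entropy reduction, the measure named in goal (5), quantifies how much a word decreases the comprehender's uncertainty about how the sentence will continue. As a hedged sketch (the distributions below are invented for illustration; the project derived them from its trained language models), it can be computed from the Shannon entropy of the distribution over continuations before and after a word, floored at zero:

```python
import math

def entropy(probs):
    """Shannon entropy (bits) of a probability distribution
    over possible sentence continuations."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def entropy_reduction(before, after):
    """Decrease in uncertainty caused by reading a word; increases
    in uncertainty are conventionally floored at zero."""
    return max(0.0, entropy(before) - entropy(after))

# Hypothetical continuation distributions around one word:
before = [0.25, 0.25, 0.25, 0.25]   # 2 bits of uncertainty
after = [0.5, 0.5, 0.0, 0.0]        # 1 bit remaining
print(entropy_reduction(before, after))   # 1.0
```

Note that surprisal and entropy reduction are distinct quantities: surprisal depends only on the probability of the word that actually occurred, whereas entropy reduction depends on the whole distribution over what might come next.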


(1) Two word-level language models were successfully implemented. One was a phrase-structure grammar (PSG), which relies crucially on hierarchical syntactic structure. The other was a recurrent neural network (RNN), which does not use any hierarchical structure. A third model (a temporal recurrent Boltzmann machine) was implemented but did not learn the language statistics well enough to be tested on the psychological data.
(2) Reading-time and EEG data were successfully collected. For reading times, two different methods were applied: self-paced reading and eye-tracking.
(3) The RNN model accounted for significantly more variance in the reading times than did the PSG, suggesting that readers do not construct hierarchical structure but rely more on sequential structure. Due to technical difficulties (now solved) with extracting ERP components from the raw EEG data, the analysis of this data is still underway. Initial results indicate that there is an effect of surprisal on the N400 component but that there is little or no difference between the two models.
(4) Pupillometry data were successfully collected and confirmed the findings from the reading times: surprisal estimates from the RNN, but not from the PSG, predict pupil dilation during reading.
(5) Entropy-reduction values, as estimated by the RNN, account for reading times over and above what is already predicted by surprisal. Conversely, surprisal also accounts for reading times over and above entropy reduction. Obtaining entropy-reduction estimates from the PSG is computationally not feasible.


The project convincingly showed that information theory is applicable to human sentence processing: words take longer to read if they have higher surprisal or entropy-reduction values. It was also shown that physiological measures of processing difficulty (pupil dilation and the N400 ERP component) are sensitive to surprisal. Most importantly, however, both reading-time and pupillometry data showed stronger effects of word surprisal estimated by the RNN than by the PSG. These findings strongly suggest that hierarchical syntactic structure is not as important to sentence comprehension as is traditionally thought. Whether the EEG data allow for the same conclusion is as yet unclear.