CORDIS - EU research results

Computational Modeling of Knowledge-Based Inference Generation during Reading Comprehension

Final Report Summary - CMOIG (Computational modelling of knowledge-based inference generation during reading comprehension)

The present project aimed to study the role of short-term working memory and long-term semantic (i.e. general knowledge) and episodic (i.e. specific text) memory associations in the generation of knowledge-based inferences during reading comprehension. To do so, an interdisciplinary methodological approach was employed, integrating computational modelling, eye-movement tracking, and response-based cognitive experiments.

Computational modelling

Semantic memory associations

To study the role of semantic memory in inference generation, semantic association strengths between textual and inferred ideas, as generated by human readers in previous behavioural studies, were computed using the latent semantic analysis (LSA) algorithm (Landauer & Dumais, 1997). LSA computes the strength of the semantic association between any pair of words or larger linguistic units (e.g. sentences and paragraphs) by assessing how frequently the pair co-occurs in the same context within large text corpora. The more frequently a pair of words co-occurs, the stronger the semantic association score between them.
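The co-occurrence idea can be illustrated with a minimal sketch. It is not LSA proper: it compares raw word-by-context count vectors with cosine similarity and omits the dimensionality reduction (singular value decomposition) that LSA applies to a large corpus. The toy contexts and the function names are assumptions made for illustration only.

```python
import math
from collections import Counter

# Toy "corpus": each string is one context (hypothetical data, not project materials).
contexts = [
    "the vase fell and the vase broke",
    "the glass fell and broke on the floor",
    "she swept the broken glass from the floor",
    "the bird sang in the tree",
]

def context_vector(word, contexts):
    """Count how often `word` occurs in each context."""
    return [Counter(context.split())[word] for context in contexts]

def cosine(u, v):
    """Cosine similarity of two equal-length count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def association(word_a, word_b, contexts):
    """Association score: words that share contexts score higher."""
    return cosine(context_vector(word_a, contexts), context_vector(word_b, contexts))

# "fell" and "broke" share contexts; "fell" and "bird" never do.
related = association("fell", "broke", contexts)
unrelated = association("fell", "bird", contexts)
```

In this toy corpus the related pair scores higher than the unrelated pair, which is the ordering the association measure is meant to capture.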

In 18 out of 21 experimental tests, LSA successfully predicted the activation of predictive and bridging inferences compared to unrelated control ideas, as well as the relatively stronger activation of bridging over predictive inferences. Stronger LSA scores of semantic association were observed for those inferences that were generated by human readers (as indicated by response times to probes). These results have important theoretical implications, particularly with regard to the activation of bridging inferences and their superiority over predictive inferences. In contrast to previous assumptions, the present findings suggest that bridging inferences can be generated via autonomous spreading activation processes, similar to predictive inferences, rather than by a strategically controlled search in background knowledge. Furthermore, their stronger activation compared to predictive inferences can be explained by stronger associations with the preceding text, because bridging inferences are associated with two or more textual units that are bridged by the inferred information, rather than by their supposedly crucial role in comprehending texts.

Reading comprehension model

Although LSA was generally successful, its failure to simulate some of the behavioural findings indicates the need to include additional cognitive components in simulating and explaining inferential processes. Thus, LSA computations were integrated into the Landscape Model (Yeari and van den Broek, 2011), which simulates dynamic cognitive processes (e.g. activations and reactivations of textual concept representations) and structures (e.g. working memory and episodic memory connections between textual concepts) during sequential reading comprehension, clause by clause. An initial connection matrix between all textual concepts was computed by LSA before the start of a reading simulation by the Landscape Model (in the original version, initial connections were set to zero). This matrix served as the relevant general knowledge a reader possesses before reading a target text. Inferred concepts were added to the simulation in the exact positions where they had been found to be generated in the behavioural studies. Using this integrated model allowed us to successfully simulate two critical findings that were not predicted by semantic associations alone. A simulation of Klin et al.'s (1999) study demonstrates the role of working memory in defining the relevant portions of text that spread their activation through semantic memory. A simulation of Singer and Halldorson's (1996) study demonstrates the role of within-episodic memory connections between textual ideas themselves in determining their level of activation and consequently their strength in activating the inferred information. More simulations were conducted on behavioural data collected in our lab. These studies and their simulations will be reviewed in the next sections.
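The role of the initial connection matrix can be illustrated with a deliberately simplified sketch. This is not the Landscape Model's published equations: the update rules, the parameter values (`max_act`, `decay`, `rate`), the concept names and the connection strengths are all hypothetical. It only shows how a non-zero prior connection lets an unmentioned (inferred) concept become active during clause-by-clause reading, whereas a zero prior does not.

```python
from itertools import combinations

def make_connections(concepts, pairs=None):
    """Build a symmetric connection table; unspecified pairs start at zero."""
    conn = {a: {b: 0.0 for b in concepts if b != a} for a in concepts}
    for (a, b), strength in (pairs or {}).items():
        conn[a][b] = conn[b][a] = strength
    return conn

def simulate(clauses, concepts, conn, max_act=5.0, decay=0.5, rate=0.1):
    """Process clauses in order; return final activations and connections.

    max_act, decay and rate are hypothetical values chosen for illustration.
    """
    conn = {a: dict(row) for a, row in conn.items()}  # do not mutate the input
    act = {c: 0.0 for c in concepts}
    for clause in clauses:
        new_act = {}
        for c in concepts:
            if c in clause:
                # Concepts mentioned in the current clause are fully activated.
                new_act[c] = max_act
            else:
                # Other concepts keep a decayed carry-over and receive
                # activation spread from the mentioned concepts.
                carried = decay * act[c]
                spread = sum(conn[m][c] * max_act for m in clause)
                new_act[c] = min(max_act, carried + spread)
        act = new_act
        # Co-active concepts strengthen their (episodic) connection.
        for a, b in combinations(concepts, 2):
            gain = rate * act[a] * act[b] / max_act ** 2
            conn[a][b] += gain
            conn[b][a] += gain
    return act, conn

concepts = ["vase", "fell", "broke"]   # "broke" is never mentioned in the text
clauses = [{"vase"}, {"fell"}]
prior = make_connections(
    concepts,
    {("vase", "fell"): 0.3, ("vase", "broke"): 0.4, ("fell", "broke"): 0.6},
)
with_prior, _ = simulate(clauses, concepts, prior)
without_prior, _ = simulate(clauses, concepts, make_connections(concepts))
```

With the hypothetical prior, the unmentioned concept ends the simulation with positive activation. With all initial connections at zero it stays at zero, because nothing links it to the text.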

Response-based cognitive experiments

Working memory and inferences

Many studies have shown that readers with a high working memory span generate more inferences than those with a low span. However, there is less agreement about which type of inference is affected (predictive vs. bridging) and about the mechanism through which working memory exerts its effect (text maintenance, text reactivation, or text inhibition). In this study we compared, in both high and low span readers, the activation of predictive and bridging inferences and the activation, reactivation and inhibition of critical text concepts, using a probing procedure. Readers read short texts and then named an inferred or textual word as quickly as possible. Working memory was assessed using a reading span test.

No differences were found between high and low span readers in their ability to generate predictive and bridging inferences, nor in their ability to maintain or reactivate prior relevant textual information. The only difference between the two groups pertained to inhibition: high span readers were better able to suppress textual information when it was less relevant. This filtering ability usually frees mental resources for generating relevant inferred information, although this was not found in the present study. More research is needed to clarify the factors in this experiment that led to the specific results.

Computational simulation

Because our model lacks an inhibition component, our main goal in simulating the working memory data was to predict the strength of inference activation following the different texts we used in the experiment. We compared the mean response facilitation found for inferred probe words (the difference between naming responses to target and matched control probes) with the activation levels that these words reached in our computational simulation. Simulating 30 out of 80 texts, we found significant correlations for both predictive (r = 0.5) and bridging inferences (r = 0.42), with higher activation corresponding to stronger facilitation. These results further demonstrate the ability of our computational model to mimic inferential processes during reading comprehension.
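The comparison described here can be sketched as follows, assuming facilitation is computed as control naming time minus target naming time and the correlation is a Pearson coefficient. The response times and activation values are hypothetical and do not reproduce the project's data.

```python
import math

def facilitation(control_rts, target_rts):
    """Per-text facilitation: how much faster the target probe is named than its control."""
    return [control - target for control, target in zip(control_rts, target_rts)]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical mean naming times (ms) for four texts.
control_rts = [520, 540, 510, 530]
target_rts = [500, 500, 505, 515]
# Hypothetical simulated activation of the inferred word after each text.
activations = [2.0, 3.5, 0.5, 1.5]

effects = facilitation(control_rts, target_rts)
r = pearson_r(effects, activations)
```

A positive `r` means that texts whose inferred word reached higher simulated activation also showed larger naming facilitation.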

Eye-tracking experiments

Reading goals

In this study we examined the effect of reading goals (entertainment, multiple-choice test, open-ended questions test, and presentation / teaching) on 'online' general reading-gaze patterns (the whole text) and selective reading-gaze patterns (central vs. peripheral information), using eye-tracking measures (e.g. first-pass reading durations, number of backward and forward regressions, pupil size).

We found that reading goals affect general reading time and the number of regressions, with faster reading and fewer regressions when reading for entertainment, yet no effect was found on first-pass reading duration of new information. First-pass reading duration, but not total time or regressions, was determined solely by the centrality of information, with slower first-pass reading of central information under all reading goals. Pupil size was affected by both reading goals and information centrality (with no interaction between them), with smaller pupil size (implying that fewer mental resources were used) for entertainment and for peripheral information. These results shed light on individual and textual factors that influence readers' focus and mental effort during reading comprehension.

Text highlighting

In this study we explored the effect of an additional text factor, text highlighting, on reading-gaze patterns and on 'offline' measurements of text recall and comprehension. We compared three highlighting conditions: (a) high-quality - central information highlighted, (b) low-quality - peripheral information highlighted, and (c) no highlighting.

Interestingly, no significant differences were found in gaze patterns for central information across the different types of highlighting. The same amount of attention was allocated to central information independent of highlighting. A highlighting effect was revealed for peripheral information: compared to the no-highlighting condition, high-quality highlighting significantly reduced the total gaze duration and number of regressions to peripheral information, whereas low-quality highlighting significantly increased the same measurements. These results illuminate the effect of good versus bad highlighting. Note that text recall (number of propositions recalled) and comprehension performance (number of correct answers) did not differ between the highlighting conditions, possibly because central information, which is crucial for recall and comprehension, received the same amount of attention in all three conditions. Central information was also better recalled than peripheral information in all highlighting conditions.

Computational simulation

The main goal of simulating the eye-tracking studies was to test whether the integrated model can predict the processing differences found for text information at different levels of centrality. In this simulation, the basic units were clauses rather than individual words. We compared the centrality score and recall frequency of each text clause, as obtained in the no-highlighting condition of the eye-tracking study, with the sum of connection strengths that each clause established with the other clauses of the same text in the computational simulation. We expected that a successful simulation would yield stronger connections between the more central, better recalled clauses and the other clauses. Taking all texts together, significant correlations were found between the sum of each clause's connection strengths and both centrality score (r = 0.21) and recall frequency (r = 0.18). Note that the correlation with centrality score (but not with recall) improved when the sum of activations during the simulation (rather than connections) was used (r = 0.24). Moreover, correlations were higher (up to r = 0.5) for individual texts. For 7 out of 9 texts, correlations reached significance. These results further validate the computational model.
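The per-clause measure used in this comparison, the sum of a clause's connection strengths with the other clauses of the same text, can be sketched as a row sum over a connection table. The clause labels and strengths below are hypothetical. In the analysis described above, such sums would then be correlated with centrality scores and recall frequencies.

```python
def connectivity(connections):
    """Sum each clause's connection strengths with all other clauses."""
    return {clause: sum(row.values()) for clause, row in connections.items()}

# Hypothetical symmetric connection strengths between three clauses of one text.
connections = {
    "c1": {"c2": 0.6, "c3": 0.5},
    "c2": {"c1": 0.6, "c3": 0.1},
    "c3": {"c1": 0.5, "c2": 0.1},
}

sums = connectivity(connections)
# Clauses ordered from most to least connected.
ranking = sorted(sums, key=sums.get, reverse=True)
```

Under the expectation stated above, clauses near the top of `ranking` should be the more central and better recalled ones.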


The present project succeeded in developing a computational model that integrates, for the first time, dynamic reading processes with knowledge-based structures, and simulates inferential processes in a fully automatic manner, independent of experimenter intervention. This model was applied in different ways in the present project, and can be further used for simulating many other comprehension phenomena in the future (e.g. the relative effects of semantic and episodic associations on comprehension). Alongside the computational endeavour, this project yielded important results regarding various elements of the reader (working memory, filtering of irrelevant information, reading goal) and the text (information centrality, highlighting) which may underlie successful reading comprehension.