Discourse Functions and Representation: an Empirically and Linguistically Motivated Inter-Disciplinary Approach to Natural Language Texts


The empirical study of discourse has reached sufficient maturity that it can and should be brought to bear on formal and computational models. Our aim is to gear text-analytical and psycholinguistic research more directly towards this goal, and to incorporate the empirical results in state-of-the-art formal and functional discourse representations.

We will attempt to represent (a selection of) the linguistic resources needed for the generation of text in a declarative and modular way. These representations will be complementary to and compatible with existing representations of linguistic knowledge at the level of the grammar and the lexicon.

Research has been carried out to develop a language-independent theory of discourse on the basis of linguistic analysis and psycholinguistic experimentation. The theory models the interactions between preferences for and constraints on surface forms that may express the same proposition but differ pragmatically.

Theoretical and empirical work was carried out in the following areas:
text types, global and local discourse structures;
chaining and thematic progression;
temporal structure and temporal connectives;
discourse functions of NP-anaphora.

The available grammar resources for the three target languages were inventoried and evaluated with respect to their potential to accept input from a discourse interface. The interfacing of discourse-thematic notions with lexicogrammatical thematization options was explored for a systemic grammar of German.

Discourse functions and their grammatical realisations are investigated and modelled for three European languages: English, German, and Dutch. This guarantees a minimal degree of language independence and makes our research directly relevant to machine translation. We limit our attention to written, monological discourse, excluding interactional phenomena like questions, answers, and acknowledgements, and speech-specific phenomena like accent placement and intonation. The result will be an executable specification of the properties of discourse that need to be enforced whenever text is used, generated, analysed, and so forth. It will be declarative and nondirectional, and it will not make any claims about optimal or human-like processing.
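
To make the terms "declarative" and "nondirectional" concrete, the following is a minimal sketch and not the project's actual specification. It assumes that form-function mappings can be stored as a plain relation and queried in either direction. All names (Mapping, MAPPINGS, query, forms_for, functions_for) and all table entries are hypothetical placeholders chosen for illustration; they are not empirical results of the project.

```python
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    """One row of a form-function relation (hypothetical schema)."""
    language: str   # "en", "de" or "nl"
    function: str   # discourse function label (placeholder)
    form: str       # grammatical realisation label (placeholder)


# Illustrative placeholder entries only, not findings of the project.
MAPPINGS = (
    Mapping("en", "topic_continuity", "pronoun"),
    Mapping("en", "topic_shift", "full_np"),
    Mapping("en", "referent_introduction", "full_np"),
    Mapping("de", "topic_continuity", "pronoun"),
    Mapping("de", "topic_shift", "full_np"),
    Mapping("nl", "topic_continuity", "pronoun"),
    Mapping("nl", "topic_shift", "full_np"),
)


def query(language: Optional[str] = None,
          function: Optional[str] = None,
          form: Optional[str] = None) -> List[Mapping]:
    """Return every row consistent with the given fields.

    A field left as None is unconstrained, so the same relation serves
    generation (function known) and analysis (form known).
    """
    return [
        m for m in MAPPINGS
        if (language is None or m.language == language)
        and (function is None or m.function == function)
        and (form is None or m.form == form)
    ]


def forms_for(language: str, function: str) -> List[str]:
    """Generation direction: which forms can realise this function?"""
    return sorted({m.form for m in query(language=language, function=function)})


def functions_for(language: str, form: str) -> List[str]:
    """Analysis direction: which functions can this form express?"""
    return sorted({m.function for m in query(language=language, form=form)})
```

Because the table states only which pairings are admissible, nothing in it commits to a processing order or strategy: forms_for and functions_for are two views of the same relation, which is the sense in which such a specification makes no claims about optimal or human-like processing.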

The research will profit from existing theories of discourse representation that originate from a sentence-based orientation. One of the theoretical results of the project will be an overview of the augmentations that essentially sentence-based accounts need in order to become fully fledged theories of discourse.


The research carried out in this project is a prerequisite for attaining the long-term goal of developing computational devices that can understand and generate natural language discourse in context. The project bridges interdisciplinary gaps by incorporating empirical results in formal and functional linguistic representations. The development tools being built for the project and the executable specifications of form-function mappings will contribute to the construction of a discourse researcher's workbench for the study of complex interactions of contextual factors and linguistic phenomena.


