AUX-Variation - a corpus based study of AUX-Variation in Danish

Final Report Summary - AUX-VARIATION (AUX-Variation - a corpus based study of AUX-Variation in Danish)

Project: AUX-Variation - a corpus based study of auxiliary variation in Danish (FP7-MC-IEF no. 630059)

The AUX-Variation project has focussed on elucidating variation in the choice of auxiliary verbs in spoken Danish, based on the largest and best-structured database available to date, the LANCHART database. The database currently contains nearly 10 million words and can be searched by speakers' social background and geographical origin, as well as by their age at the time of recording and their year of birth. Data from six different locations covering the entire country are included. In total, approximately 17,500 examples were extracted automatically and subsequently coded manually for linguistic features. Anu Laanemets' study is therefore the best-documented study so far of how native speakers of Danish actually use the auxiliaries HAVE and BE.
The results of the investigation reveal that the choice between the two auxiliaries is determined less by social factors than by linguistic factors. Sociolinguistic factors such as gender and social class did not have a significant effect on auxiliary selection; only the age of the speakers turned out to be relevant. In other words, with respect to this phenomenon, men and women use the language in the same way, while the language use of the youngest speakers differs from that of the middle-aged and oldest speakers. This difference indicates an ongoing language change.
In more theoretical terms, the findings support the hypothesis that linguistic variation in different components of language (i.e. phonology, morphology, syntax and discourse) is determined by linguistic and/or social factors to different degrees. As earlier studies have shown, phonetic and phonological variation often co-varies with social variables, while this is less likely to be the case with morphological and syntactic variables, as this study also confirms. In the light of that knowledge, the present project is an important piece of the puzzle assembled by research within the field of variationist sociolinguistics. It contributes to our understanding of the interplay between the social and the linguistic domains as determining factors of linguistic variation. At the current state of research, explanations for this kind of difference can only be informed guesses and hypotheses for further investigation. One explanation could be that syntactic variation, unlike many phonological variables, is not a means of marking group identity. At least in languages with a strong written tradition and a fixed standard language, the use of a non-standard form would signal that the speaker does not master the language sufficiently well. The variation is then connected with the notion of correctness rather than with group identity.
Still, by far the greater part of the variation in auxiliary choice in Danish is explained by linguistic factors. A particularly revealing feature of this study is its close attention to semantic differences in the main verbs used, which may explain much of the variation. Which auxiliary verb is used appears to depend on the interpretation of the main verb. This part of the study contributes crucial new data and decisive evidence to the discussion about whether languages which are verb-framed (such as Danish) have different distributions according to the semantic interpretation of the main verb than languages which consistently favour a satellite-framed mode of communication. In this way the general discussion about the choice between the auxiliaries HAVE and BE may be brought into a new phase focussing on the functional semantics of the relevant constructions.

The study of how languages are structured, vary and change over time is valuable in academic and cultural terms, but it also has practical and public implications. The results of the current project contribute to the fulfilment of the EU's language policies (ET2010 and ET2020), in which promotion of knowledge of the European languages is acknowledged as a common priority. In particular, the results contribute to our common knowledge of language diversity and promote awareness of how language variation is affected by regional, social, gender- and age-related factors. The results will in this way be of direct value to language teachers and textbook writers as well as language technologists and lexicographers. Studies such as this one, which disclose intricate and hitherto unnoticed semantic differences in the structure of naturally occurring language, also aid the development of more precise translation software, which in the long term may contribute to the economic and intercultural impact of research as the world becomes increasingly globalised.