Discourse, Context & Media 1 (2012) 95–102
Discourse, Context & Media journal homepage: www.elsevier.com/locate/dcm
Diachronic changes in subjectivity and stance: A corpus linguistic study of Dutch news texts
Kirsten Vis (a, corresponding author), José Sanders (b), Wilbert Spooren (c)
a Utrecht Institute of Linguistics OTS, Utrecht University, The Netherlands
b Centre for Language Studies, Radboud University Nijmegen, The Netherlands
c Department of Language and Communication, VU University Amsterdam, The Netherlands
Article info
Available online 22 September 2012
Abstract
In several studies of English data, researchers have observed a trend of ‘informalization’: a shift of stylistic preferences in public written discourse, such as journalistic texts, towards a more conversational, or oral, style. In this paper, we aim to contribute to this issue by empirically testing this informalization thesis for Dutch. For this purpose we operationalize informalization in terms of linguistic expressions of subjectivity. Subjectivity is considered here as the expression of speakers of themselves and their own ‘private states’, such as attitudes, beliefs, opinions, emotions and evaluations. Our model of subjectivity includes elements such as personal pronouns (first and second person), modal verbs and modal adverbials. Comparing newspapers from 1950/1 and 2002, we are able to show that, based on those parameters, subjectivity in Dutch newspapers has increased. However, it is not primarily journalists who express themselves and their private states more: rather, the increasing subjectivity lies in the citations of words of other speakers embedded in the newspaper articles. The use of direct quotations has almost doubled, and the subjectivity expressed in the quotations has increased dramatically as well. It seems, then, that in this case the subjectivity assumed in the informalization thesis lies primarily in the proportion of quoted speech of characters in the news texts and in the subjective content of that speech. Informalization does not occur primarily through a more oral style in the journalist’s text, but through literal citations of conversations of other speakers.
© 2012 Published by Elsevier Ltd.
Keywords: Subjectivity; Stance; Informalization; News; Conversation; Oral style
1. Introduction: a change towards an oral style
In linguistics a fundamental distinction is made between spoken and written discourse (e.g. Chafe, 1982). In recent years, researchers from several backgrounds have suggested that the gap between the two types of discourse is narrowing. In particular, public written discourse seems to be moving towards a more conversational, or oral, style. Such a shift in stylistic preferences has been observed in both socio-cultural approaches to language and more corpus-based studies of several types of public discourse, including journalism and politics. A socio-cultural approach to this change has been provided by Fairclough (1994). Within the tradition of critical discourse analysis he has signaled a tendency in public communication – such as journalism, politics, and communication on health care and education – ‘‘from a distant, impersonal, formal
Acknowledgments: The authors wish to thank two anonymous reviewers and the editors of the volume for their valuable comments on an earlier version of the manuscript and Mike Hannay for correcting our English.
Correspondence to: Faculty of Humanities, Utrecht University, Trans 10, 3512 JK Utrecht, The Netherlands. Tel.: +31 30 253 8757; fax: +31 30 253 6000. E-mail address: [email protected] (K. Vis).
2211-6958/$ - see front matter © 2012 Published by Elsevier Ltd. http://dx.doi.org/10.1016/j.dcm.2012.09.003
public discourse towards conversation and personalized discourse’’ (Fairclough and Mauranen, 1997: 117–8). However, either his empirical evidence, even though intuitively plausible, is limited to anecdotal examples from present-day language use, or otherwise the discussion is restricted to an analysis of a number of individual politicians, as is the case of the diachronic study in Fairclough and Mauranen (1997). Other researchers have found more systematic signs of a shift towards conversations in public communication. In a diachronic corpus study (1.7 million words from 1650 to 1990) Biber and Finegan (2001) described a shift in popular written discourse (letters, news reporting, diaries, fiction) from a ‘literate’ towards a more ‘oral’ style. The term ‘oral’ refers to language elements produced in situations that are typical of or expected for speaking, such as first and second person pronouns, interjections and hedges, and demonstratives, whereas the term ‘literate’ refers to language produced in situations that are typical of writing, such as nouns, adjectives, prepositions and lengthy words in general. Biber and Finegan found that the specialist expository genres (medical prose, science prose, and legal prose), however, have followed a consistent course towards ever more literate styles. Similarly, in a study of the shift in news and academic prose
K. Vis et al. / Discourse, Context & Media 1 (2012) 95–102
towards a more oral style (320,000 words), Hundt and Mair (1999) found that genres differ in their openness to innovation: news, for instance, is relatively open to innovation (‘‘agile’’), whereas academic prose is rather reluctant to change (‘‘uptight’’). From the results of a diachronic study of 560 texts from four different genres (news, drama, personal letters and medical prose) over a period from 1650 to 1990, Biber (2004) concluded that there are major changes in social norms that are reflected by the various genres. Expressions of opinions, attitudes and emotions occur most in conversations (Biber et al., 1999), but in other genres speakers are more inclined to express their opinions and attitudes as well, especially over the last fifty years. In a corpus study of 120 editorials from The Times (85,000 words), Steen (2003) found an increase in involvement and a decrease in what he, following Biber (1988), marks as narrativity: past tense, perfect aspect and third person. In his view, this can be interpreted as a stylistic change towards conversations, as these typically show involvement, are persuasive in nature and show few markings of narrativity. On the basis of a diachronic corpus analysis (100,000 words), Cotter (2003) also concluded that newspapers adopt characteristics from conversation: in regional newspapers from California she found an increase in the frequency of sentence-initial use of ‘and’ and ‘but’, a phenomenon that is typically associated with spoken language. In a diachronic corpus analysis (37,000 words) of 32 British political broadcasts between 1966 and 1997, Pearce (2005) found an increase in informalization, reflected in linguistic characteristics such as a decrease in the number of nominalizations and an increase in the use of first and second person pronouns and of that-complement clauses.
In the different studies several terms have been introduced to highlight the shift in stylistic preferences, such as colloquialization (Mair, 2006), conversationalization (Fairclough, 1994; Biber, 2004), and informalization (Pearce, 2005). We will use the most general term, informalization. Although most research on informalization focuses on English, such as the studies presented above, some studies suggest that it is not confined to English (cf. Steger, 1989; Kübler, 1985 for German, and Biber, 1995 for Somali). The question remains whether the generally well-documented trend can also be observed in Dutch written public discourse, more specifically Dutch journalistic texts. This question is especially interesting since there is evidence that the journalistic practices in The Netherlands developed differently in the 19th and first half of the 20th century compared to the United States of America and the United Kingdom (Broersma, 2007): Dutch newspapers did not implement certain stylistic innovations in this period, such as selecting news on the basis of its news value instead of on the basis of political bias, using the inverted pyramid formula, in which newspapers started with a summary lead, and taking objectivity as the moral norm for reporting (Høyer and Pöttker, 2005), when Anglo-American newspapers were undergoing major changes. For example, it was only in the 1930s that Dutch newspapers adopted the routine of strictly separating facts from opinions, while the sensationalist topics and emotional and involving style employed by Anglo-American newspapers were still considered indecent by most Dutch journalists. Furthermore, the Dutch journalistic landscape differed from the Anglo-American one because of the strict compartmentalization of Dutch society along socio-political lines into several ideological segments in the first half of the 20th century.
Most newspapers were deeply rooted in one of these segments, institutionally, with respect to their content, and with respect to their readership. It was only after the disintegration of this segregated social structure in the 1960s that the segregated journalistic culture was replaced by an autonomous journalistic culture, in which norms were derived from professional journalist practice, and a professionalization of
journalism and of media more generally took place (see Vis, 2011: section 1.4, for more information). The results of the studies mentioned above form a good point of departure to test the informalization thesis for Dutch journalistic discourse. In this study we will trace informalization in newspapers through the analysis of a feature typical of conversation: the involvement of speakers and their attention to themselves, their opinions and attitudes, and to the presence of a listener.
2. Subjectivity and stance
Following Biber (2004), Biber and Finegan (2001), Cotter (2003), Hundt and Mair (1999), Pearce (2005) and Steen (2003), we look at features of spontaneous conversation. Among the characteristics of spontaneous spoken interaction are (Clark, 1996) presence of the speaker, self-expression of the speaker, and expression by the speaker of attention for the presence of the listener. In the literature these features are grouped under labels such as ‘subjectivity’ (as in Langacker, 1990) and ‘stance’ (as in Biber, 1988; Pearce, 2005). Subjectivity is defined as the expression of the speaker’s attitudes, beliefs, opinions, emotions and evaluations (Langacker, 1990; Lyons, 1994; Wiebe et al., 2005); stance is the expression of emotions, attitudes and opinions (Biber et al., 1999). Although ‘stance’ and ‘subjectivity’ belong to different theoretical paradigms, there is considerable overlap between the concepts, as is clear from the definitions. For the purposes of this study, we will draw on previous research on both subjectivity and stance. Biber (2004) studied the diachronic development of stance in four genres through an analysis of the three major structural types of stance marking (Biber et al., 1999): (semi-) modal verbs (can, must, got to, be going to), stance adverbials (hopefully, certainly), and complement clause constructions (suggest that, possible that, hope to, appear to, likely to). Scheibman (2002) analyzed subjectivity in American English conversations. Her corpus analysis resulted in a list of general features of subjectivity, such as personal and possessive pronouns (first and second person singular and plural), modal verbs, intensifiers and modal adverbs. Wiebe (1994) developed a computational linguistic model for automatic identification and extraction of opinions and emotions.
On the basis of a large scale corpus analysis she distinguished several linguistic elements that she claimed express subjectivity, such as exclamations, direct questions, elements that express evaluation or judgment (awful, incredibly) and evidentials (surely, might). In a corpus of Dutch newspaper texts Bekker (2006) studied the relationship between subjectivity and sentence order. The potentially subjective elements that she distinguished include expressions of modality, evaluations, expressions of mental activity, and communicative actions. This overview shows that the terms ‘‘subjectivity’’ and ‘‘stance’’ are used interchangeably with respect to the same linguistic features. In this way, the two concepts are strongly related. The frequency of linguistic features of subjectivity and stance is seen, then, as the measure for informalization. From here on we will only use the term ‘subjectivity’ to refer to both ‘subjectivity’ and ‘stance’. Special attention should be paid to the concept ‘speaker subjectivity’, as introduced by Bekker (2006). It refers to the subjectivity of the speaker of a text as opposed to the subjectivity of other persons that appear in the text. This distinction is important. Subjectivity is, as mentioned before, the expression of speakers with respect to themselves and their private states; it foregrounds speakers and their attitudes and opinions. In newspapers the words of the writer of the article, the journalist, are presented as
well as the speech of other speakers, news sources. Journalists use different discourse forms to present the speech of other speakers. The forms most frequently used in Dutch newspapers are as follows:
Direct speech
Direct speech is marked through the use of quotation marks, to make explicit what has been uttered by another person. The quotation is signaled by a verb like say or by a phrase like according to:
(1) ‘Dit is voor ons absoluut niet te controleren’, zegt Sevenstar-directeur Koolhof. (Algemeen Dagblad, May 3, 2002, Section: domestic news)
    ‘It is absolutely impossible for us to check this’, says Sevenstar managing director Koolhof.
Partial quotes
Quotation marks also make it explicit that one or more words (but not entire sentences) have been uttered by another person:
(2) Volgens bronnen binnen de bank vindt Smits dat het werken hem ‘onmogelijk is gemaakt’. (Trouw, September 3, 2002, Section: financial news)
    According to sources within the bank Smits thinks that working ‘has been made impossible’ for him.
Indirect speech
When indirect speech is used, the words of the other person are represented without quotation marks, indicated by a complement clause that is introduced by a verb of communication such as say or answer:
(3) Naar aanleiding van enkele vragen van de president van de raad Prof. Mr. Verzijl, antwoordde Fischer, dat hij het gebeurde thans betreurt. (NRC, July 6, 1950, Section: domestic news)
    In response to questions from the president of the board, Professor Verzijl, Fischer replied that he now regrets what happened.
Semi-direct speech
Related to indirect speech there is a mixed form. In the Algemene Nederlandse Spraakkunst (Haeseryn et al., 1997), a grammar of Dutch, this form is called semi-direct speech. When one reads a quotation in this form, it is only at the end of the sentence that it becomes clear that the sentence (and maybe even more) must be interpreted as speech by another person:
(4) Operaties kunnen bij milde gevallen echter erg ingrijpend zijn, vindt Berghmans. (Algemeen Dagblad, May 10, 2002, Section: science)
    Operations can, however, be very far-reaching in less severe cases, thinks Berghmans.
When using direct speech and partial quotes journalists explicitly state – through the graphic device of quotation marks – that they assume no responsibility for the fragment between quotation marks; rather, the fragment is the responsibility of the news source (Sanders and Spooren, 1997). In examples (1) and (2) the fragments between quotation marks fall entirely under the responsibility of Sevenstar managing director Koolhof and of Smits, former CEO at the Rabobank, respectively. When another person’s speech is represented through indirect or semi-direct speech, the determination of responsibility is less clear; in these cases the journalist is at least partly responsible for the wording. To be able to distinguish between the voice of the journalist and the voice of other persons, we distinguish between two groups of speech representation. In the first group, consisting of indirect speech and semi-direct speech, it is the voice of the journalist that is foregrounded, as is the case in fragments where there is no citation of other persons at all. Subjective elements occurring in these fragments are counted as originating from the journalist. In the second group, formed by direct speech and partial quotes (together: direct quotations), the voices of other persons are explicitly
foregrounded. Subjective elements occurring in this direct speech and these partial quotes are considered to originate from the cited source. For examples (1) through (4) this means that thans (now) in example (3), and erg (very) in example (4), are counted as indications of speaker subjectivity, whereas absoluut (absolutely) and ons (us) in example (1), and onmogelijk (impossible) in example (2), are counted as indications of other-person subjectivity. If the informalization thesis is correct, one would expect an increase in the presence of subjective elements over time. The question arises whether the informalization thesis extends to both speaker and other-person subjectivity. For the present we expect an increase in subjectivity across the board: we expect that the presence of the main speaker of journalistic discourse has become more manifest; we also expect that the presence of other speakers in news texts has increased, and furthermore, that the speakers of the citations have become more manifest. Therefore, we expect an increase of the frequency of features of subjectivity in the speech of the journalist (Hypothesis 1), an increase of the proportion of direct quotations of other speakers (Hypothesis 2), and an increase of the features of subjectivity in these direct quotations (Hypothesis 3) over time.
3. Method
The aim of this study is to test the informalization thesis for Dutch newspapers. To this end we carried out three corpus analyses, in which the frequencies of occurrence of the subjective elements in news texts were compared between two periods. The first analysis includes all subjective elements, both for speaker and other-person subjectivity; the second analysis considers only elements of speaker subjectivity (the subjective elements under the responsibility of the journalist); and the third analysis involves only the subjectivity of other persons in the text (the subjective elements in the direct quotations). The material consists of two news subcorpora, one containing newspaper texts from 1950 and 1951, and one containing newspaper texts from 2002. The texts from 1950 and 1951 were scanned (using Optical Character Recognition) from original paper copies available in three Dutch libraries;[1] the 2002 texts were extracted from the international newspaper database LexisNexis. Table 1 describes the corpus in more detail, with an overview of the newspapers and sections included. All texts were annotated automatically for part of speech and lemma information.[2] In a project funded by the CLARIN project[3] (CLARIN-NL-10-016), the corpus has been made available for further scientific research through the Dutch-Flemish HLT Agency (Centrale voor Taal- en Spraaktechnologie) at the Institute for Dutch Lexicology (Leiden), under the title ‘VU Diachronic News text Corpus’ (VU DNC) via www.inl.nl. For the analysis of subjective elements we draw on operationalizations of subjectivity by Bekker (2006), Biber (2004), Scheibman (2002), Wiebe (1994) and Wiebe et al. (2005).[4] The list was
[1] The paper copies of Algemeen Dagblad were available at the Royal Library in The Hague; NRC Handelsblad, de Telegraaf and de Volkskrant were placed at our disposal by the library of the University of Amsterdam; Trouw was available at the library of the VU University Amsterdam.
[2] For the annotation of part of speech and lemma information we made use of Tadpole, a program for automatic morpho-syntactic analysis and parsing of Dutch texts (van den Bosch et al., 2007).
[3] The CLARIN project is a large-scale European research infrastructure project designed to establish an integrated and interoperable infrastructure of language resources and technologies, cf. www.clarin.eu.
[4] In addition to the lexicogrammatical analysis presented here, an analysis of subjectivity at the textual level has been performed on a similar, but smaller, corpus in another study. In this analysis, discussed in Vis et al. (2010), the texts have been analyzed for subjective coherence relations.
Table 1. The corpus.
Subcorpus | Number of words
Newspaper texts from 1950 and 1951 [a] (originating from Algemeen Dagblad, NRC [b], de Telegraaf, Trouw and de Volkskrant; spread over sections: front page, domestic news, foreign news, financial news, culture, opinion, sports, science, various) | 931,574 words (3615 texts)
Newspaper texts from 2002 (originating from Algemeen Dagblad, NRC Handelsblad [b], de Telegraaf, Trouw and de Volkskrant; spread over sections: front page, domestic news, foreign news, financial news, culture, opinion, sports, science) | 971,059 words (3003 texts)
[a] The texts from NRC, de Telegraaf, and Trouw are from 1950. For Algemeen Dagblad and de Volkskrant, however, no original paper prints were available for 1950. Instead, texts from 1951 were chosen.
[b] In 1950 this paper was called Nieuwe Rotterdamsche Courant (NRC). In 1970, however, it merged with Algemeen Handelsblad to form NRC Handelsblad.

Table 2. Subjective elements in the model of analysis and examples.
Subjective elements | Source | Examples
Pronouns (first and second person singular and plural) | H, S | ik, jij, mijn, onze (I, you, my, our)
Modal adverbials: modal adverbs | Be, Bi1, Bi2, H, S, W1, W2 | mogelijk, zeker, eigenlijk, hopelijk (possibly, definitely, actually, hopefully)
Modal adverbials: modal particles | H, W1 | nog, al, pas (still, already, only/just)
Intensifiers | S, W1, W2 | nogal, erg, bijna, nauwelijks (quite, very, almost, hardly)
Modal verbs | Bi1, Bi2, H, S, W1, W2 | kunnen, moeten, blijken, schijnen (can, must, appear to, seem to)
Cognitive verbs | Be, Bi1, Bi2, S, W2 | zeggen, denken, hopen, verwachten (say, think, hope, expect)
Modal functions of imperative | H | Kom hier! (Come here!)
Exclamations | S, W1, W2 | Wat mooi! (How beautiful!)
Questions | W1, W2 | Hoe nu deze crisis te verklaren? (How to explain this crisis?)
Deictic elements | Be, W1 | nu, hier, gisteren (now, here, yesterday)
completed by the descriptions of the grammatical categories of stance and modality in the grammars of Biber et al. (1999) and Haeseryn et al. (1997). Table 2 presents an overview of the analytic model, including for every element the source and a number of examples. The appendix contains a fragment of an annotated text. Vis et al. (2009) demonstrate that this model is suitable for the analysis of subjectivity as a measure of informalization: in a synchronic comparison of Dutch conversations from the Corpus of Spoken Dutch (50,000 words; Oostdijk, 2000) and Dutch newspaper texts from 2002 (55,000 words from the same newspapers and sections as used in this study) all elements in the model occur more frequently in the conversations than in the newspapers. These differences are significant,[5] which demonstrates that the elements under investigation in fact have the potential to reflect a change in subjectivity. The two subcorpora were annotated automatically for subjectivity: a subjectivity lexicon was compiled with lexical entries for each of the subjective elements; all occurrences in the subcorpora of these lexical entries were marked automatically with a subjectivity code. Additionally, the fragments in direct speech and partial quotes (direct quotations) were marked with XML tags for direct quotations, thus enabling their isolation in the corpus. This annotation was also performed automatically: all fragments between quotation marks were annotated as direct quotations. These are not limited to direct speech and partial quotes, but also include some names and titles (e.g. of plays and books). However, this was not problematic, as a sample survey showed that these formed only a small percentage of all words between quotation marks (< 5%), and they contained no subjective elements. After the entire corpus had been tagged for subjectivity and direct quotation, the frequencies of the subjective elements in the two subcorpora were calculated.
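The annotation procedure described above can be illustrated with a minimal sketch. It is not the tooling used in the study: the four-word lexicon, the regular expressions and the function names (`split_by_quotation`, `annotate`) are assumptions made for illustration, and the matching is on surface word forms rather than on the lemma and part-of-speech annotation the study relied on. Everything between quotation marks is treated as direct quotation and attributed to the cited source; the remainder is attributed to the journalist.

```python
import re

# Hypothetical mini-lexicon of word forms, taken from examples (1), (3) and (4);
# the study's actual subjectivity lexicon is not reproduced here.
LEXICON = {"ons", "absoluut", "erg", "thans"}

# Assumption: quotations use curly single, curly double or straight double marks.
QUOTE_RE = re.compile(r"‘([^’]*)’|“([^”]*)”|\"([^\"]*)\"")
TOKEN_RE = re.compile(r"\w+")


def split_by_quotation(text):
    """Return (journalist_text, quoted_text): text outside vs. inside quotation marks."""
    quoted = [next(g for g in m.groups() if g is not None)
              for m in QUOTE_RE.finditer(text)]
    journalist = QUOTE_RE.sub(" ", text)
    return journalist, " ".join(quoted)


def subjective_hits(text):
    """Return the lexicon entries found in the text, in order of occurrence."""
    return [t for t in (tok.lower() for tok in TOKEN_RE.findall(text)) if t in LEXICON]


def annotate(text):
    """Attribute lexicon hits to the journalist (speaker) or to the quoted source."""
    journalist, quoted = split_by_quotation(text)
    return {"speaker": subjective_hits(journalist),
            "other_person": subjective_hits(quoted)}


example_1 = "‘Dit is voor ons absoluut niet te controleren’, zegt Sevenstar-directeur Koolhof."
example_4 = "Operaties kunnen bij milde gevallen echter erg ingrijpend zijn, vindt Berghmans."
print(annotate(example_1))
print(annotate(example_4))
```

With this toy lexicon, the hits in example (1) fall inside the quotation and the hit in example (4) falls outside it, which mirrors the attribution to other-person and speaker subjectivity described in Section 2.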
The frequencies of the speaker subjective elements and of the other-person subjective elements were calculated separately, as well. The differences between the corresponding frequencies in the subcorpora were statistically tested using a log likelihood test,[6] which allows for comparisons of frequencies in corpora even when the studied phenomena are relatively rare. In the next section the results of these calculations are presented. For the sake of comparability the frequencies were normalized for 10,000 words.
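As an illustration of the two calculations mentioned here, the sketch below normalizes a raw count to 10,000 words and computes the two-corpus log-likelihood statistic (G2) in its standard corpus-linguistic form: expected counts are derived from the pooled frequency, and G2 is twice the sum of observed times ln(observed/expected). The function names and the raw counts in the usage lines are hypothetical; only the subcorpus sizes come from Table 1, and the study itself used the online UCREL calculator rather than this code.

```python
import math


def normalized_frequency(count, corpus_size, per=10_000):
    """Occurrences per `per` words of the reference corpus."""
    return count * per / corpus_size


def log_likelihood(count_a, size_a, count_b, size_b):
    """Two-corpus log-likelihood (G2) for one feature.

    Expected counts assume the feature is equally frequent in both corpora.
    """
    total = count_a + count_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


# Hypothetical raw counts; the corpus sizes are those reported in Table 1.
size_1950, size_2002 = 931_574, 971_059
count_1950, count_2002 = 400, 520
print(normalized_frequency(count_1950, size_1950))
print(normalized_frequency(count_2002, size_2002))
print(log_likelihood(count_1950, size_1950, count_2002, size_2002))
```

With one degree of freedom, a G2 value above 3.84 corresponds to p < 0.05 and a value above 10.83 to p < 0.001, which is how the G2 and p columns in the results tables relate to each other.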
[5] The differences are significant (minimal G2 value 43.56, p < 0.001), with the exception of modal subjunctives, where there is no significant difference (G2(1) = 0.07, p = 0.79). Since this shows that modal subjunctives are not typical for conversational registers, they were not included in the model here.
[6] For this test the Log-likelihood calculator was used that can be found on the web site http://ucrel.lancs.ac.uk/llwizard.html, consulted on March 17, 2010.
Note to Table 2: the modal adverbials comprise modal adverbs and modal particles. Source abbreviations: Be = Bekker (2006), Bi1 = Biber et al. (1999), Bi2 = Biber (2004), H = Haeseryn et al. (1997), S = Scheibman (2002), W1 = Wiebe (1994), W2 = Wiebe et al. (2005).
Table 3. Number of subjective elements in newspapers per 10,000 words, by year.
Subjectivity feature | 1950/1 | 2002 | G2 | p
1st pronoun sg | 19.9 | 61.3 | 2093.93 | < 0.001
1st pronoun pl | 31.2 | 29.9 | 2.80 | 0.09
2nd pronoun sg | 10.4 | 28.5 | 833.05 | < 0.001
2nd pronoun pl | 0.3 | 0.7 | 16.50 | < 0.001
Modal adverbs | 25.1 | 24.3 | 1.18 | 0.28
Modal particles | 98.3 | 142.8 | 783.67 | < 0.001
Intensifiers | 49.0 | 66.0 | 238.05 | < 0.001
Modal verbs | 169.1 | 165.8 | 2.92 | 0.09
Cognitive verbs | 87.7 | 111.8 | 277.54 | < 0.001
Exclamation marks | 4.2 | 3.5 | 6.18 | < 0.05
Questions | 10.1 | 14.3 | 70.26 | < 0.001
Deictic elements | 41.2 | 40.2 | 1.09 | 0.30
Note: The normalized frequencies are calculated w.r.t. the entire corpus.
4. Results
Hypothesis 1 assumes that newspapers from 2002 contain more subjective elements than newspapers from 1950/1. This hypothesis was tested through a diachronic comparison of the two newspaper subcorpora. First we present a comparison between the newspapers without distinction between speaker subjectivity and other-person subjectivity. In this comparison all subjective elements in the subcorpora are included. The results are presented in Table 3 and in Fig. 1, which provides a more visual presentation. The frequencies are normalized for 10,000 words of the entire corpus and presented pair-wise per element, the left
[Fig. 1: bar chart. Y-axis: frequency, normalized for 10,000 words (0 to 250). X-axis: the twelve subjective elements (1st person singular pronoun through deictic element). Paired bars: newspapers 1950/1 and newspapers 2002.]
Fig. 1. Comparison of the frequencies of occurrence of subjective elements in newspapers from 1950/1 and newspapers from 2002, normalized for 10,000 words.
bars representing the frequencies in the newspapers from 1950/1 and the right bars the frequencies in the newspapers from 2002. The picture provided by Fig. 1 is quite clear: all elements that show a significant change increase over time (1st person singular pronouns, 2nd person singular pronouns, 2nd person plural pronouns, cognitive verbs, questions, intensifiers, and modal particles), the only exception being the exclamation marks (signaling both imperatives and exclamations), which decrease significantly. All other elements remain stable (1st person plural pronouns, deictic elements, modal adverbs and modal verbs); they show a non-significant decrease. Subsequently, the newspapers from 1950/1 and 2002 were compared without the subjective elements in quotation; in other words, only the occurrences of subjective elements that fall under the responsibility of the journalist (speaker subjectivity) were compared. Table 4 and Fig. 2 present the frequencies, again pair-wise and normalized to 10,000 words of the entire corpus. Compared to the first diachronic comparison (Fig. 1), a number of differences may be observed. Two elements that increased significantly in the first comparison do not show a significant change in this second comparison: 1st person singular pronouns and 2nd person plural pronouns. For four elements, the first comparison showed no change, whereas in this comparison they have decreased significantly: 1st person plural pronouns, deictic elements, modal adverbs and modal verbs. The other elements change in the same direction as they did in Fig. 1. Only one of them shows a change that is stronger in this comparison than in the first comparison, namely the exclamation marks, which decrease more sharply.
For the other elements (2nd person singular pronouns, modal particles, intensifiers, cognitive verbs and questions) there is still an increase, but the difference between 1950/1 and 2002 is smaller when the direct quotations are excluded. Finally, Table 5 and Fig. 3 show the comparison between 1950/1 and 2002 of the subjective elements within the direct quotations.[7]
[7] In this case the frequencies are normalized for 10,000 words of the direct quotations.
Table 4. Number of speaker subjective elements in newspapers per 10,000 words, by year.
Subjectivity feature | 1950/1 | 2002 | G2 | p
1st pronoun sg | 12.6 | 12.0 | 1.27 | 0.26
1st pronoun pl | 24.5 | 7.6 | 901.10 | < 0.001
2nd pronoun sg | 6.9 | 11.2 | 101.32 | < 0.001
2nd pronoun pl | 0.1 | 0.2 | 0.67 | 0.41
Modal adverbs | 22.2 | 16.6 | 77.81 | < 0.001
Modal particles | 88.0 | 109.2 | 281.88 | < 0.001
Intensifiers | 43.5 | 49.6 | 37.59 | < 0.001
Modal verbs | 150.3 | 123.3 | 254.68 | < 0.001
Cognitive verbs | 77.9 | 84.9 | 28.85 | < 0.001
Exclamation marks | 3.2 | 1.7 | 45.19 | < 0.001
Questions | 8.1 | 9.6 | 11.83 | < 0.001
Deictic elements | 37.4 | 30.8 | 61.79 | < 0.001
Note: The normalized frequencies are calculated w.r.t. the entire corpus.
Hypothesis 3 predicts that the subjective elements in the direct quotations increase over time. This is confirmed by Table 5 and Fig. 3: almost all elements show a statistically significant increase. Only the 2nd person plural pronouns and the exclamation marks do not change significantly. None of the elements decrease significantly between 1950/1 and 2002. Hypothesis 2 concerns the amount of quoted speech in relation to the journalist’s text. The expectation is that the proportion of quoted speech has increased. To test this hypothesis the proportions of direct quotations (cf. footnote 8) in newspapers from 1950/1 and 2002 were compared. The results in Table 6 show that this proportion increased between 1950/1 and 2002: the relative number of words in direct quotation was significantly higher in 2002 than in 1950/1.
5. Conclusion and discussion
The assumption underlying this study is that subjectivity (including stance) is a marker of informalization expressed through linguistic elements detailed in the list in Section 3. Previous research (Vis et al., 2009) has demonstrated that the
[Fig. 2: bar chart. Y-axis: frequency, normalized for 10,000 words (0 to 250). X-axis: the twelve subjective elements (1st person singular pronoun through deictic element). Paired bars: newspapers 1950/1 and newspapers 2002.]
Fig. 2. Comparison of the frequencies of occurrence of speaker subjective elements in newspapers from 1950/1 and newspapers from 2002, normalized for 10,000 words.
Table 5 Number of other-person subjectivity features in direct quotations per 10,000 words of direct quotations, by year.
Subjectivity feature  1950/1   2002     G²        P
1st pronoun sg        63.6     234.5    1383.21   < 0.001
1st pronoun pl        58.4     106.4    193.73    < 0.001
2nd pronoun sg        30.4     82.1     332.11    < 0.001
2nd pronoun pl        1.6      2.7      3.71      0.05
Modal adverbs         25.3     36.9     29.94     < 0.001
Modal particles       90.1     159.8    269.46    < 0.001
Intensifiers          47.4     77.9     101.10    < 0.001
Modal verbs           162.6    202.7    61.11     < 0.001
Cognitive verbs       84.7     127.8    121.06    < 0.001
Exclamation marks     8.7      8.6      0.01      0.92
Questions             17.4     22.7     9.69      < 0.01
Deictic elements      32.4     44.8     27.88     < 0.001
Note: The normalized frequencies are calculated w.r.t. the direct quotations.
model of analysis is suitable for the study of subjectivity as a measure of informalization: all elements occur more frequently in conversations than in news, indicating that the elements are typical of informal discourse. The results of the diachronic comparisons indicate that overall subjectivity in Dutch newspapers has increased. However, if a distinction is made between speaker subjectivity and other-person subjectivity, the results show that it is not primarily journalists who express themselves and their private states more in 2002 than in 1950/1 (Table 4). Rather, the change in subjectivity lies primarily in the direct quotations: the share of direct quotations in the newspaper text increases (cf. Table 6). Furthermore, the frequency of occurrence within direct quotations increases for almost all subjective elements (cf. Table 5).
What does this mean for the informalization thesis? Is there a change in Dutch newspapers towards a more informal and conversational style or not? Overall, when no distinction is made between the subjectivity expressed by journalists and the subjectivity expressed by other persons (in direct quotations), the newspapers do seem to have become more subjective, and hence appear to provide evidence for the informalization thesis (cf. Table 3). This accords with the findings of previous studies (Fairclough, 1994; Biber and Finegan, 2001; Hundt and Mair, 1999; Biber, 2004; Steen, 2003; Cotter, 2003; Pearce, 2005). However, the separate comparisons for speaker subjectivity and other-person subjectivity in this study make it clear that this is an over-simplified conclusion. In fact, it is not the case that newspapers overall have become more subjective: only the quoted speech embedded in the newspaper texts shows a clear increase in subjectivity. Furthermore, the proportion of quoted speech has increased significantly.
These results raise the question to what extent the findings of previous researchers on informalization in news have been affected by the fact that direct quotations were not treated separately in the analysis (e.g. Biber, 2004; Biber and Finegan, 2001; Hundt and Mair, 1999). The fact that the use of quoted speech has increased significantly in this study, and that this quoted speech has become more subjective, leads to the conclusion that informalization does not occur in the way that other researchers have claimed, namely through journalists using a more oral style by expressing their attitudes and opinions. Instead, the oral style is brought into the text in a different way, by citing conversation through direct speech representation. In other words, the news sources speak in the news text and in this way express their own, other-person, subjectivity.
We see several possibilities for further research. The analyses presented here compare one newspaper subcorpus to a newspaper subcorpus from another period, and do not take into account the different genres within the newspaper or the different newspapers that the subcorpora contain. Previous research on the development of Dutch newspapers seems to indicate that changes might differ between newspapers. For instance, Wolf (2007) and Broersma (2007) argue that Dutch newspapers (such as the ones used in this study) developed somewhat differently over the last century. In a follow-up study, comparisons between newspapers and genres will aim to provide more insight into the changes that have taken place; not only may individual newspapers have their own characteristic development, but changes over time may also differ between genres (for example, hard news texts compared with more opinion-based texts) (Vis et al., in preparation). Also, the
[Fig. 3: bar chart. Y-axis: frequency, normalized for 10,000 words (0 to 250). Series: newspapers 1950/1 and newspapers 2002. X-axis: the twelve subjectivity features, from pronoun 1st person singular to deictic element.]
Fig. 3. Comparison of the frequency of occurrence of other-person subjective elements in direct quotation from 1950/1 and 2002, normalized for 10,000 words of direct quotation.
Table 6 Proportions of words in direct quotations per 10,000 words of direct quotations, by year.
1950/1   2002     G²          P
1153     2101     26602.69    < 0.0001
increasing use of direct quotations over time in the Dutch newspapers could be interpreted as an expression of changes in professional practices in journalism, in this case a changing view about the distinction between facts and opinions (Sanders, 2009). This issue deserves further research, for example by analyzing the style guides of the newspapers included in this study.
What has become clear from our results is that in a study of subjectivity in news discourse it is crucial to distinguish between the different voices in the newspaper, i.e. the voice of the journalist and the voices of other persons in the text. Therefore, a comparison of subjectivity between newspapers and between genres in the news is necessarily a three-level comparison: first for both voices, second for the journalist’s voice only, and third for the voices of the quoted sources (Vis et al., in preparation).
Appendix A. Example of a fragment of an annotated text
Fragment from an article from Algemeen Dagblad (May 3, 2002, Section: domestic news)
Original text: Ook bij Sevenstar wist men niets van de smokkel. ‘Dit is voor ons absoluut niet te controleren’, zegt Sevenstar-directeur Koolhof. (At Sevenstar the smuggling was not known either. ‘It is absolutely not possible for us to check this’, says Sevenstar managing director Koolhof.)
Fragment
<sent id="26">
<br/>
<pau ref="Abi3.txt.26" s="UNKNOWN">
<pw ref="Abi3.txt.26.1" pos="BW()" lem="ook" subj="y" subjcat="modal" subjscat="part">Ook</pw>
<pw ref="Abi3.txt.26.2" pos="VZ(init)" lem="bij">bij</pw>
<pw ref="Abi3.txt.26.3" pos="N(eigen,ev,basis,zijd,stan)" lem="Sevenstar">Sevenstar</pw>
<pw ref="Abi3.txt.26.4" pos="WW(pv,verl,ev)" lem="weten" subj="y" subjcat="verbcogn" subjtense="past">wist</pw>
<pw ref="Abi3.txt.26.5" pos="VNW(pers,pron,nomin,red,3p,ev,masc)" lem="men">men</pw>
<pw ref="Abi3.txt.26.6" pos="VNW(onbep,pron,stan,vol,3o,ev)" lem="niets">niets</pw>
<pw ref="Abi3.txt.26.7" pos="VZ(init)" lem="van">van</pw>
<pw ref="Abi3.txt.26.8" pos="LID(bep,stan,rest)" lem="de">de</pw>
<pw ref="Abi3.txt.26.9" pos="N(soort,ev,basis,zijd,stan)" lem="smokkel">smokkel</pw>
<pl ref="Abi3.txt.26.10" pos="LET()" lem=".">.</pl>
</pau>
</sent>
<sent id="27">
<pau ref="Abi3.txt.27" s="UNKNOWN">
<DS>
<pw ref="Abi3.txt.27.1" pos="LET()" lem="‘">‘</pw>
<pw ref="Abi3.txt.27.2" pos="VNW(aanw,pron,stan,vol,3o,ev)" lem="dit"><mrw type="met">Dit</mrw></pw>
<pw ref="Abi3.txt.27.3" pos="WW(pv,tgw,ev)" lem="zijn" subjtense="pres">is</pw>
<pw ref="Abi3.txt.27.4" pos="VZ(init)" lem="voor"><mrw type="met">voor</mrw></pw>
<pw ref="Abi3.txt.27.5" pos="VNW(pr,pron,obl,vol,1,mv)" lem="ons" subj="y" subjcat="1stp" subjscat="pl">ons</pw>
<pw ref="Abi3.txt.27.6" pos="ADJ(vrij,basis,zonder)" lem="absoluut">absoluut</pw>
<pw ref="Abi3.txt.27.7" pos="BW()" lem="niet">niet</pw>
<pw ref="Abi3.txt.27.8" pos="VZ(init)" lem="te">te</pw>
<pw ref="Abi3.txt.27.9" pos="WW(inf,vrij,zonder)" lem="controleren">controleren</pw>
<pw ref="Abi3.txt.27.10" pos="LET()" lem="‘">‘</pw>
</DS>
<pw ref="Abi3.txt.27.11" pos="LET()" lem=",">,</pw>
<pw ref="Abi3.txt.27.12" pos="WW(pv,tgw,met-t)" lem="zeggen" subj="y" subjcat="verbcogn" subjtense="pres">zegt</pw>
<pw ref="Abi3.txt.27.13" pos="N(soort,ev,basis,zijd,stan)" lem="sevenstar-directeur">Sevenstar-directeur</pw>
<pw ref="Abi3.txt.27.14" pos="N(eigen,ev,basis,zijd,stan)" lem="Koolhof">Koolhof</pw>
<pl ref="Abi3.txt.27.15" pos="LET()" lem=".">.</pl>
</pau>
</sent>
Notes: The subjective elements are underlined. subj="y" stands for ‘subjective element’. <DS> stands for direct speech.
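An annotation of this kind makes it possible to count subjective elements separately for the journalist's voice and for the quoted sources. The following is a minimal sketch of such a count, not the procedure used in the study. It assumes that the tags are well-formed XML with the element and attribute names shown in the fragment; the name `count_by_voice`, the shortened `SAMPLE` string and the decision to exclude tokens tagged `LET()` (punctuation) from the word count are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Shortened version of sentence 27 from the appendix fragment.
SAMPLE = """<sent id="27">
<pau ref="Abi3.txt.27" s="UNKNOWN">
<DS>
<pw ref="Abi3.txt.27.5" pos="VNW(pr,pron,obl,vol,1,mv)" lem="ons" subj="y" subjcat="1stp" subjscat="pl">ons</pw>
<pw ref="Abi3.txt.27.7" pos="BW()" lem="niet">niet</pw>
</DS>
<pw ref="Abi3.txt.27.11" pos="LET()" lem=",">,</pw>
<pw ref="Abi3.txt.27.12" pos="WW(pv,tgw,met-t)" lem="zeggen" subj="y" subjcat="verbcogn" subjtense="pres">zegt</pw>
<pw ref="Abi3.txt.27.14" pos="N(eigen,ev,basis,zijd,stan)" lem="Koolhof">Koolhof</pw>
</pau>
</sent>"""


def count_by_voice(element, in_quote=False, counts=None):
    """Count words and subjective elements inside and outside <DS>."""
    if counts is None:
        counts = {"journalist": Counter(), "quoted": Counter()}
    if element.tag == "DS":
        in_quote = True  # everything below a <DS> element is direct speech
    voice = "quoted" if in_quote else "journalist"
    # Assumption: punctuation tokens (pos starting with LET) are not words.
    if element.tag == "pw" and not element.get("pos", "").startswith("LET"):
        counts[voice]["words"] += 1
        if element.get("subj") == "y":
            counts[voice][element.get("subjcat")] += 1
    for child in element:
        count_by_voice(child, in_quote, counts)
    return counts


counts = count_by_voice(ET.fromstring(SAMPLE))
for voice, tally in counts.items():
    print(voice, dict(tally))
```

Keeping the two tallies apart from the start means that the same pass over a text yields the word counts for the journalist's text and for the direct quotations as well as the per-voice counts of each subjectivity category.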
References
Biber, Douglas, Finegan, Edward, 2001. Diachronic relations among speech-based and written registers in English. In: Conrad, Susan, Biber, Douglas (Eds.), Variation in English: Multi-Dimensional Studies. Longman, London.
Biber, Douglas, 2004. Historical patterns for the grammatical marking of stance: a cross-register comparison. Journal of Historical Pragmatics 5, 107–136.
Biber, Douglas, Johansson, Stig, Leech, Geoffrey, Conrad, Susan, Finegan, Edward, 1999. The Longman Grammar of Spoken and Written English. Longman, London.
Biber, Douglas, 1988. Variation Across Speech and Writing. Cambridge University Press, Cambridge.
Biber, Douglas, 1995. Dimensions of Register Variation: A Cross-Linguistic Comparison. Cambridge University Press, Cambridge.
Broersma, Marcel, 2007. Form, style and journalistic strategies. An introduction. In: Broersma, Marcel (Ed.), Form and Style in Journalism. European Newspapers and the Representation of News, 1880–2005. Paris and Dudley, Leuven, pp. ix–xxix.
Bekker, Birgit, 2006. De feiten verdraaid. Over tekstvolgorde, talige markering en sprekerbetrokkenheid. Dissertation. University of Tilburg, Tilburg, The Netherlands.
Chafe, Wallace, 1982. Integration and involvement in speaking, writing, and oral literature. In: Tannen, Deborah (Ed.), Spoken and Written Language: Exploring Orality and Literacy. Ablex, Norwood, N.J., pp. 35–53.
Cotter, Colleen, 2003. Prescription and practice. Motivations behind change in news discourse. Journal of Historical Pragmatics 4, 45–74.
Clark, Herbert, 1996. Using Language. Cambridge University Press, Cambridge.
Fairclough, Norman, 1994. Conversationalization of public discourse and the authority of the consumer. In: Keat, Russell, Whiteley, Nigel, Abercrombie, Nicholas (Eds.), The Authority of the Consumer. Routledge, London, pp. 253–268.
Fairclough, Norman, Mauranen, Anna, 1997. The conversationalisation of political discourse. Belgian Journal of Linguistics 11, 89–119.
Hundt, Marianne, Mair, Christian, 1999. ‘‘Agile’’ and ‘‘uptight’’ genres: the corpus-based approach to language change in progress. International Journal of Corpus Linguistics 4, 221–242.
Høyer, Svennik, Pöttker, Horst, 2005. Diffusion of the News Paradigm 1850–2000. Nordicom, Göteborg.
Haeseryn, Walter, Romijn, Kirsten, Geerts, Guido, de Rooij, Jaap, Van den Toorn, Maarten, 1997. ANS. Algemene Nederlandse Spraakkunst, second ed. Martinus Nijhoff, Groningen.
Kübler, Hans-Dieter, 1985. Ende der Schriftkultur? Anmerkungen zu einem wissenschaftlichen Modethema. Wirkendes Wort 35, 338–362.
Langacker, Ronald, 1990. Subjectification. Cognitive Linguistics 1, 5–38.
Lyons, John, 1994. Subjecthood and subjectivity. In: Yaguello, Marina (Ed.), Subjecthood and Subjectivity: Proceedings of the Colloquium ‘The Status of the Subject in Linguistic Theory’. Ophrys, Paris, pp. 9–17.
Mair, Christian, 2006. Twentieth-Century English. Cambridge University Press, Cambridge.
Oostdijk, Nelleke, 2000. The Spoken Dutch Corpus project. The ELRA Newsletter 5, 4–8.
Pearce, Michael, 2005. Informalization in UK party election broadcasts: 1966–97. Language & Literature 14, 65–90.
Steen, Gerard, 2003. Conversationalization in discourse: Stylistic changes in editorials of The Times between 1950 and 2000. In: Lagerwerf, Luuk, Spooren, Wilbert, Degand, Liesbeth (Eds.), Determination of Information and Tenor in Texts: Multidisciplinary Approaches to Discourse. Nodus Publikationen, Münster, pp. 115–124.
Steger, Hugo, 1989. Sprache im Wandel. In: Benz, Wolfgang (Ed.), Die Geschichte der Bundesrepublik Deutschland, vol. 3: Kultur. Fischer, Frankfurt am Main.
Scheibman, Joanne, 2002. Point of View and Grammar: Structural Patterns of Subjectivity in American English Conversation. John Benjamins, Amsterdam.
Sanders, José, Spooren, Wilbert, 1997. Subjectivity, perspectivization, and modality from a cognitive linguistic point of view. In: Liebert, Wolf-Andreas, Redeker, Gisela, Waugh, Linda (Eds.), Discourse and Perspective in Cognitive Linguistics. John Benjamins, Amsterdam, pp. 85–112.
Sanders, José, 2009. De verdeling van verantwoordelijkheid tussen journalist en nieuwsbron. Vorm en functie van citaatmengvormen in journalistieke genres. Tijdschrift voor Taalbeheersing 31, 1–17.
van den Bosch, Antal, Busser, Bertjan, Daelemans, Walter, Canisius, Sander, 2007. An efficient memory-based morphosyntactic tagger and parser for Dutch. In: Van Eynde, Frank, Dirix, Peter, Schuurman, Ineke, Vandeghinste, Vincent (Eds.), Selected Papers of the 17th Computational Linguistics in the Netherlands Meeting, Leuven, Belgium, pp. 99–114.
Vis, Kirsten, 2011. Subjectivity in news discourse. A corpus linguistic analysis of informalization. Dissertation. VU University, Amsterdam, The Netherlands.
Vis, Kirsten, Spooren, Wilbert, Sanders, José, 2010. Using RST to analyze subjectivity in text and talk. In: Tabakowska, Elzbieta, Choinski, Michal, Wiraszka, Lukasz (Eds.), Cognitive Linguistics in Action: From Theory to Application and Back. Mouton de Gruyter, Berlin, pp. 293–316.
Vis, Kirsten, Sanders, José, Spooren, Wilbert, 2009. Subjectiviteit door de jaren heen: conversationalisatie in journalistieke teksten. (Subjectivity over the years: conversationalization in journalistic texts). In: Spooren, Wilbert, Onrust, Margreet, Sanders, José (Eds.), Studies in Taalbeheersing 3 (Studies in Language Use 3). Van Gorcum, Assen, pp. 405–418.
Vis, Kirsten, Sanders, José, Spooren, Wilbert, in preparation.
Wiebe, Janyce, Wilson, Theresa, Cardie, Claire, 2005. Annotating expressions of opinions and emotions in language. Language Resources and Evaluation 39, 165–210.
Wiebe, Janyce, 1994. Tracking point of view in narrative. Computational Linguistics 20, 233–287.
Wolf, Mariëtte, 2007. An Anglo-American newspaper in Holland. Form and style of De Telegraaf (1893–1940). In: Broersma, Marcel (Ed.), Form and Style in Journalism. European Newspapers and the Representation of News, 1880–2005. Paris and Dudley, Leuven.
Kirsten Vis is a post-doc researcher at the Faculty of Humanities, Utrecht University. She has published on text linguistics, specializing in subjectivity in news discourse.
José Sanders is an associate professor of Communication and Information Sciences at the Centre for Language Studies, Radboud University, Nijmegen.
At the time of writing, Wilbert Spooren was professor of Language and Communication at the Faculty of Arts, VU University, Amsterdam. Presently he holds the chair of Language Use and Discourse Studies at the Centre for Language Studies, Radboud University, Nijmegen.
Wilbert Spooren and José Sanders have both published extensively on text linguistics, specializing in coherence and subjectivity.