On the distribution of noun-phrase types in English clause-structure


(1971) 281-293, © North-Holland Publishing Company. Not to be reproduced in any form without written permission from the publisher.

ON THE DISTRIBUTION OF NOUN-PHRASE TYPES IN ENGLISH CLAUSE-STRUCTURE 1)

F. G. A. M. AARTS

INTRODUCTION

The investigation reported in this paper is based on an examination of a corpus of approximately 72,000 words of present-day English. This corpus, which forms part of the files of the Survey of English Usage at University College London, comprises 14 texts of ca. 5,000 words each, which may be divided into four categories: (1) light fiction: texts 6.1, 6.2, 6.3 and 6.4; (2) scientific writing: texts 8a.1, 8a.3 and 8a.4; (3) informal speech: texts 5b.1, 5b.16 and S.1c.11; (4) formal spoken and written English: texts 5b.2 and 5b.51, 8b.1 and 8ab.2.

The question we are interested in is whether it is possible to demonstrate non-randomness in the distribution of noun-phrase types in English clause-structure. We shall assume that, to a large extent, this distribution is determined not only by the function of the noun-phrase in the clause, but also stylistically, with respect to the variety of English. More specifically, we shall try to find evidence for the hypotheses that there is a correlation between subject-exponents and structural 'lightness' on the one hand and a very strong tendency for non-subject-exponents to be realized by structurally 'heavy' noun-phrase types on the other.

The distinction drawn here between 'light' and 'heavy' noun-phrase types is to be interpreted as follows. By 'light' items we shall here understand: (1) pronouns, (2) names, (3) nouns, neither pre- nor postmodified, (4) nouns, premodified by determiners only.

1) This paper was written during a stay at University College London (1969-1970), which was partly supported by a grant from the Niels Stensen Foundation in Amsterdam. I am very much obliged to Professor Randolph Quirk for permission to use the files of the Survey of English Usage and especially for his many helpful suggestions and comments.

TABLE 1 (columns a-c)

                a. Subject             b. Direct object       c. Prepositional adjunct
Texts       Pers.   Other  Names    Pers.   Other  Names    Pers.   Other  Names
            pron.   pron.           pron.   pron.           pron.   pron.
6.1          329     36     94       55     16      3        50     16     20
6.2          508     83    164       76     51      7        51     36     22
6.3          271     48     72       41     10      4        34     15     29
6.4          232     35     71       40     20     10        23     16     22
8a.1          57     65      4        ?      2      -         7     34      6
8a.3         117     73      1        ?      4      -         4     24      2
8a.4          42     75     25        ?      2      -         5     22     41
5b.1         432     90     23       54     29      5        30     39     54
5b.16        520     97     26       44     47      5        39     38     23
S.1c.11      643     88     22       72     51      -        46     29     27
5b.2         360    126     16       43     36      ?        32     37     25
5b.51        346    101      8       22     37      ?        30     36      6
8b.1         163     59     60       25     25      ?        15     27     13
8ab.2        154     60     25       22     19      ?        10     25     58
Total       4174   1036    611      501    357     38       376    394    348

? = position of the entry not legible in the source. The science texts show 4 and 3 personal pronouns as direct object in two of the three texts; the formal texts show 1 and 3 names as direct object in two of the four texts.

By 'heavy' items we shall understand:
(a) All other premodified noun-phrases, i.e. those premodified by (1) adjective(s), (2) genitive, (3) noun, (4) adjective + noun, (5) genitive + noun.
(b) All postmodified noun-phrases, i.e. those postmodified by (1) prepositional phrase(s), (2) prepositional phrase(s) + clause, (3) prepositional phrase + non-finite clause, (4) non-finite clause, (5) relative clause.
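The two definitions can be restated as a small decision procedure. The following Python sketch is only an illustration: the `NounPhrase` record, its field names and the modifier labels are hypothetical and are not part of the original study, in which the corpus was classified by hand. It also assumes that pronouns and names always count as 'light', a case the definitions do not spell out for modified pronouns.

```python
from dataclasses import dataclass, field

# Premodifier labels that make a noun-phrase 'heavy' (list (a) above).
# The label strings are assumptions made for this sketch.
HEAVY_PREMODIFIERS = {"adjective", "genitive", "noun"}


@dataclass
class NounPhrase:
    head_type: str                                      # "pronoun", "name" or "noun"
    premodifiers: list = field(default_factory=list)    # e.g. ["determiner", "adjective"]
    postmodifiers: list = field(default_factory=list)   # e.g. ["prepositional phrase"]


def weight(phrase: NounPhrase) -> str:
    """Return 'light' or 'heavy' following the definitions in the text."""
    if phrase.head_type in ("pronoun", "name"):
        return "light"      # assumption: pronouns and names are always light
    if phrase.postmodifiers:
        return "heavy"      # every postmodified noun-phrase is heavy
    if any(p in HEAVY_PREMODIFIERS for p in phrase.premodifiers):
        return "heavy"      # adjective, genitive or noun premodifier
    return "light"          # bare noun, or noun with determiner(s) only
```

For instance, `weight(NounPhrase("noun", ["determiner"]))` gives `'light'`, while adding an adjective or a relative clause turns the same phrase into a 'heavy' one.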

1. PRONOUNS AND NAMES

Table 1 lists all the pronouns and names that occur in the corpus as exponents of the various places in clause-structure (columns a-f). 2)

2) The grammatical terms in this paper are being used with the values given to them in R. W. Zandvoort, A handbook of English grammar, 11th ed., Groningen, 1969.

TABLE 1 (continued, columns d-f)

Columns d (Indirect object), e (Nominal part of predicate) and f (Predicative adjunct) are each subdivided, like columns a-c, into personal pronouns, other pronouns and names. The entries of these columns are too badly damaged in the source to be assigned to individual texts.

As appears from column a, all 14 texts have a large number of personal pronouns at S. As exponents of non-S they decrease markedly as we move from left to right: their number is considerably smaller in columns b and c, almost negligible in columns d and e, and they are totally lacking in column f. Other pronouns and names are also more frequent at S than elsewhere, but the ratio of S-exponents to non-S-exponents is different from that of the personal pronouns. Table 1, then, contains overwhelming evidence to show that structurally 'light' items tend to be found at S rather than elsewhere in the clause. As table 2 shows, in this corpus pronouns and names at S are in fact 2.6 times as frequent as those at non-S.

There is more evidence to support our hypothesis. Before going on to examine it, a few comments must be made on the figures in table 1.

TABLE 2

                      Subject   Non-subject
Personal pronouns       4174        978
Other pronouns          1036        813
Names                    611        402
Total                   5821       2193
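The ratio quoted in the text can be recomputed from table 2. The short Python sketch below is an illustration only; the variable names are ours, and the figures are copied from the table.

```python
# Figures copied from table 2: (subject, non-subject) counts per exponent type.
table2 = {
    "personal pronouns": (4174, 978),
    "other pronouns": (1036, 813),
    "names": (611, 402),
}

subject_total = sum(s for s, _ in table2.values())
non_subject_total = sum(n for _, n in table2.values())

# Ratio of S-exponents to non-S-exponents, per type and overall.
for label, (s, n) in table2.items():
    print(f"{label:18s} S/non-S = {s / n:.2f}")
print(f"{'all':18s} S/non-S = {subject_total / non_subject_total:.2f}")
```

The overall ratio comes out at about 2.65, and the per-type ratios show how much more strongly personal pronouns are tied to S than other pronouns and names are.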

The question to be answered here is whether there is any significant relation between the four categories of texts and the distribution of the figures in table 1 for S (column a) and non-S (columns b-f). For each of the three kinds of exponents (personal pronouns, other pronouns and names) this was examined by applying the χ² test to the following 2 x 2 tables 3) (table 3).

TABLE 3

Personal pronouns
  1          Texts 6.1-6.4      Texts 5b.1-S.1c.11
  S              1340                1595
  non-S           421                 309

  2          Texts 8a.1-8a.4    Texts 5b.2-8ab.2
  S               216                1023
  non-S            23                 225

Other pronouns
  3          Texts 6.1-6.4      Texts 5b.1-S.1c.11
  S               202                 275
  non-S           206                 251

  4          Texts 8a.1-8a.4    Texts 5b.2-8ab.2
  S               213                 346
  non-S            91                 265

Names
  5          Texts 6.1-6.4      Texts 5b.1-S.1c.11
  S               401                  71
  non-S           127                 117

  6          Texts 8a.1-8a.4    Texts 5b.2-8ab.2
  S                30                 109
  non-S            49                 109
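The test applied to these 2 x 2 tables can be sketched in a few lines of Python. This is a minimal illustration, assuming the uncorrected Pearson statistic with one degree of freedom; the paper does not say whether a continuity correction was used, and the function names are ours. The two tables chosen (1 and 6) are those whose cells are clearly legible in the source.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variate with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Rows are S and non-S, columns are the two text groups being compared.
personal_fiction_vs_informal = (1340, 1595, 421, 309)   # 2 x 2 table 1
names_science_vs_formal = (30, 109, 49, 109)            # 2 x 2 table 6

for label, cells in [("table 1", personal_fiction_vs_informal),
                     ("table 6", names_science_vs_formal)]:
    chi2 = chi_square_2x2(*cells)
    print(f"{label}: chi2 = {chi2:.2f}, p = {p_value_df1(chi2):.4g}")
```

Under these assumptions table 1 falls well below p = 0.001 and table 6 stays above p = 0.025, in line with the verdicts reported for them.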

The χ² test shows that of the above 2 x 2 tables, 1, 4 and 5 are statistically highly significant (p < 0.001), 2 is significant (p < 0.01) and 3 and 6 are insignificant (p > 0.025). Linguistically speaking this means that there is a strong association between personal pronouns and 'subjectness' in all four categories of texts (2 x 2 tables 1 and 2). Since the distribution in 2 x 2 tables 3 and 6 is not significant, the same conclusion cannot be drawn for other pronouns and names. The former, according to 2 x 2 table 4, are typical S-exponents in text-categories 2 and 4 (Scientific writing and Formal spoken and written English), the latter, according to 2 x 2 table 5, in text-category 1 (light fiction) only.

3) For help with statistical problems I am very grateful to Mr Wright of the Computer Centre of University College London and to Mrs Caroline Bott.

One final comment should be made on the science texts in this connection. Not only do they have a strikingly low number of personal (and other) pronouns at non-S (indeed there are none at all in columns d and f) and an even smaller number of names (none in columns b, d, e and f), they also have comparatively few personal pronouns as subject. There is some evidence to show that one of the reasons for this may be that the subject in this type of text tends to avoid pronominalization. Either of two things may happen. The noun may simply be repeated, as in the following examples:

From the lower part of the mid- and hind-brain arose all the cranial nerves except the olfactory and optic. These nerves follow the same plan as those of gnathostomes. (8a.1)

The phenyl radicals are capable of effecting arylation, and the arylation is inhibited by the presence of ... (8a.4)

or else the noun is replaced by an equivalent:

Little is known about the condition in sea lampreys, where blood is probably hypotonic to the sea. When in the river the animals must deal with the tendency for water to flow ... (8a.1)

These examples of course prove little or nothing, but it might well be worth investigating this phenomenon to see whether this is a characteristic feature of scientific English.

2. PREMODIFIED AND POSTMODIFIED NOUN-PHRASES

Table 4 lists the various premodifiers (including zero) that occur in the corpus with the non-pronominal exponents of S and non-S. If we distinguish between 'light' items (columns a-b) on the one hand and 'heavy' items (columns c-i) on the other, and combine the figures in columns a and b with those in table 2, we get the following distribution of 'light' items at S and non-S (table 5).

TABLE 4

Columns: a = Ø + head; b = determiners + head; c = 1 adjective + head; d = 2 adjectives + head; e = 3 (or more) adjectives + head; f = genitive + head; g = noun + head; h = adjective + noun + head; i = genitive + noun + head.

                               a      b      c     d     e     f     g     h     i
Subject                       168    760    283    31     2    26   100     7    1?
Direct object                 203    615    307    59     3    11    71    12    1?
Prepositional adjunct         457   1090    728    93     6    31   170    30    2?
Indirect object                 3      6      1     -     -     1     1     -     ?
Nominal part of predicate      44    151    176    18     -    10    34     4     ?
Predicative adjunct             4      4      8     3     -     ?     ?     ?     ?

? = entry not legible, or not legible with certainty, in the source.

TABLE 5

                Pronouns/names    ± Determiner + head
Subject              5821                 928
Non-subject          2193                2560
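The strength of the contrast in table 5 can also be expressed as an odds ratio. This measure is not used in the paper; the Python sketch below is offered only as a supplementary illustration, with variable names of our own and an approximate 95% interval computed on the log scale (Woolf's method).

```python
import math

# Table 5: rows are subject / non-subject, columns are the two kinds of 'light' item.
pron_subj, noun_subj = 5821, 928
pron_non_subj, noun_non_subj = 2193, 2560

# Odds of a pronoun or name (rather than a plain noun) at S, relative to non-S.
odds_ratio = (pron_subj * noun_non_subj) / (noun_subj * pron_non_subj)

# Approximate 95% interval on the log scale (Woolf's method).
se_log = math.sqrt(1 / pron_subj + 1 / noun_subj + 1 / pron_non_subj + 1 / noun_non_subj)
low = math.exp(math.log(odds_ratio) - 1.96 * se_log)
high = math.exp(math.log(odds_ratio) + 1.96 * se_log)

print(f"odds ratio = {odds_ratio:.2f} (95% CI {low:.2f} to {high:.2f})")
```

An odds ratio well above 1 here means that, among 'light' items, pronouns and names are more strongly tied to subject position than nouns with or without determiners.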

The χ² test shows that this distribution is highly significant (p < 0.001). The same is true for the tables of each of the four groups of texts separately (table 6). It is evident, then, from the above, that within the category of the structurally 'light' items, we must distinguish between pronouns and names on the one hand and nouns (plus or minus determiners) on the other, the former being more typical exponents of S than the latter. Table 4 requires little further comment. It will be noticed that the premodified noun-phrases in columns d-i represent only a minority of those in the whole of the corpus and that the frequency of their occurrence decreases with their complexity.

TABLE 6

                                      Pronouns/names   ± Determiner + head
Light fiction
  Subject                                  1943               277
  Non-subject                               754               928
Scientific writing
  Subject                                   459               261
  Non-subject                               163               466
Informal speech
  Subject                                  1941               123
  Non-subject                               677               492
Formal spoken and written English
  Subject                                  1478               267
  Non-subject                               599               674

So far we have only examined structurally 'light' exponents, that is pronouns, names and nouns premodified by zero and determiners. This has gone some way to providing evidence for our hypotheses, but we must also take account of the structurally 'heavy' items, that is to say the other premodified noun-phrases in table 4 (columns c-i) and all postmodified noun-phrases. The latter are contained in table 7. In table 8 the figures of table 7 have been broken down according to textual category. Thus the 410 subject noun-phrases that are postmodified by one prepositional phrase are distributed over the four textual categories as indicated in the four top cells of column a.

TABLE 7

Columns: a = noun + 1 prep. phrase; b = noun + 1 prep. phrase + clause; c = noun + 1 prep. phrase + non-finite clause; d = noun + 2 prep. phrases; e = noun + 2 prep. phrases + clause; f = noun + 3 prep. phrases; g = noun + non-finite clause; h = noun + relative clause.

                               a     b     c     d     e     f     g     h
Subject                       410    14    16    74     1    10    58   128
Direct object                 411    45    18    62     8    14    79     ?
Prepositional adjunct         709    67    30   133    14    26    98   249
Indirect object                 1     -     -     -     -     -     -     -
Nominal part of predicate     196    26    11    31     9    10    36    92
Predicative adjunct             5     -     -     -     -     -     -     2

TABLE 8

Columns a-h as in table 7.

                                       a     b     c     d     e     f     g     h
Subject
  Light fiction                        59     1     4     9     -     ?     9    17
  Scientific writing                  168     6    10    35     -     ?    30    30
  Informal speech                      49     6     -     7     -     ?     6    23
  Formal spoken and written English   134     1     2    23     1     5    13    58
Non-subject
  Light fiction                       324    28    15    30     4     5    58   105
  Scientific writing                  327    32    32    97    14    30    82   101
  Informal speech                     235    25     5    28     2     3    35    96
  Formal spoken and written English   436    53     7    71    11    12    38   204

? = entry not legible in the source (tables 7 and 8).

Perhaps it is a little dangerous to attempt to draw far-reaching conclusions from a stylistic point of view. It is evident, however, from the perfectly regular distribution of the figures in table 8, that postmodified noun-phrases, in all four text-groups, typically appear in functions other than subject. Moreover, categories 2 and 4 (Scientific writing and Formal spoken and written English) tend to have a larger number of postmodified noun-phrases than the Fiction and Informal speech groups. This is true of both the top and bottom halves of table 8.

What does not become clear from table 8 is the correlation between text-variety and degree of structural complexity. Let us confine ourselves to one example. According to column h the Fiction texts contain 105 noun-phrases in non-subject functions postmodified by a relative clause and the Science texts 101. The numerical difference is negligible. However, an examination of the texts themselves shows that the novels have a preference for fairly simple noun-phrases of this type, whereas scientists tend to use very complex structures like the following:

It was this circumstance that led Fowler to introduce the name 'co-operative transition', which emphasises that the interaction must be 'self-helping' in the sense of increasing in importance with the progress of the change it is tending to promote. (8a.3)

Grieve and Hey also showed that the reaction could be extended for the arylation of a solid aromatic compound ArH by using an organic solvent (chloroform or carbon tetrachloride) which is relatively inert towards the radicals involved, and which, when it does undergo reaction with them to a small extent, forms compounds which are easily separated from the desired substitution product ArAr'. (8a.4)

3. 'LIGHT' AND 'HEAVY' ITEMS COMPARED

Tables 1, 4 and 7 account for the distribution of all noun-phrase types in the whole of the corpus. Table 9 combines these figures, distinguishing between 'light' noun-phrase types in columns a-b and 'heavy' ones in columns c-e.

TABLE 9

Columns: 'Light': a = pronouns/names; b = ± determiner + head. 'Heavy': c = nouns premodified by 1 adjective; d = nouns postmodified by 1 prep. phrase; e = nouns otherwise pre- or postmodified.

                                   a      b      c      d      e    Total
All functions                     8014   3488   1494   1732   2233   16961
As subjects                       5821    928    283    410    456    7898
As complements or in adjuncts     2193   2560   1211   1322   1777    9063

In a more simplified form the data in table 9 may be presented in the following 2 x 2 table (table 10).

TABLE 10

                'Light' items   'Heavy' items
Subjects            6749            1149
Non-subjects        4753            4310
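The step from table 9 to table 10 is a simple collapse of columns, which can be checked mechanically. The Python sketch below is an illustration only; the dictionary layout and names are ours, and the figures are the subject and non-subject rows of table 9.

```python
# Table 9, columns a-e; 'light' = columns a-b, 'heavy' = columns c-e.
table9 = {
    "subjects": [5821, 928, 283, 410, 456],
    "non-subjects": [2193, 2560, 1211, 1322, 1777],
}

# Collapse the five columns into the two cells per row of table 10.
table10 = {
    row: {"light": sum(cells[:2]), "heavy": sum(cells[2:])}
    for row, cells in table9.items()
}

for row, counts in table10.items():
    total = counts["light"] + counts["heavy"]
    share = 100 * counts["heavy"] / total
    print(f"{row:12s} light={counts['light']} heavy={counts['heavy']} "
          f"heavy share={share:.1f}%")
```

The collapsed figures agree with table 10, and the shares make the contrast concrete: roughly 15 per cent of subjects are 'heavy', against nearly 48 per cent of non-subjects.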

The χ² test again shows that this distribution is highly significant (p < 0.001), here as well as in the case of the tables for each of the four text genres in table 11.

TABLE 11

                                      'Light' items   'Heavy' items
Light fiction
  Subjects                                2220             211
  Non-subjects                            1682            1121
Scientific writing
  Subjects                                 720             447
  Non-subjects                             629            1140
Informal speech
  Subjects                                2064             148
  Non-subjects                            1169             811
Formal spoken and written English
  Subjects                                1745             343
  Non-subjects                            1273            1238
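Since all four genre tables are significant, it may help to compare how strong the association is in each. One common measure for a 2 x 2 table is the phi coefficient; it does not appear in the paper, so the Python sketch below is a supplementary illustration under that assumption, with figures copied from table 11 and names of our own.

```python
import math

# Table 11: (light subjects, heavy subjects, light non-subjects, heavy non-subjects).
table11 = {
    "Light fiction": (2220, 211, 1682, 1121),
    "Scientific writing": (720, 447, 629, 1140),
    "Informal speech": (2064, 148, 1169, 811),
    "Formal spoken and written English": (1745, 343, 1273, 1238),
}


def phi(a, b, c, d):
    """Phi coefficient of the 2 x 2 table [[a, b], [c, d]]; 0 means no association."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))


for genre, cells in table11.items():
    print(f"{genre:35s} phi = {phi(*cells):.2f}")
```

A positive value in every genre corresponds to the pattern described in the text: 'light' items go with subjects and 'heavy' items with non-subjects.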

Now that more complex structures have been examined, it is clear that our hypothesis, which had already received support from the data in table 1, is amply confirmed by the tables above.

4. TEXTUAL ANALYSIS

Table 12 contains a textual analysis of the figures in the two bottom rows of table 9, thus enabling us to examine their distribution for each of the 4 text-groups separately. We now see that the situation we found in table 9 is exactly paralleled in table 12 for each of these. The two columns on the left have already been commented on above. The other three show that 'heavy' noun-phrase types are much less likely to occur as exponents of subject than as exponents of other functions in the clause. Independently of their function in the clause, they would also seem to be more frequent in Scientific writing and in formal language than in Fiction and informal speech.

TABLE 12

Columns: 'Light': a = pronouns/names; b = ± determiner + head. 'Heavy': c = nouns premodified by 1 adjective; d = nouns postmodified by 1 prep. phrase; e = nouns otherwise pre- or postmodified.

                                       a      b      c      d      e    Total
Subject
  Light fiction                       1943    277     60     59     92    2431
  Scientific writing                   459    261    104    168    175    1167
  Informal speech                     1941    123     37     49     62    2212
  Formal spoken and written English   1478    267     82    134    127    2088
Non-subject
  Light fiction                        754    928    363    324    434    2803
  Scientific writing                   163    466    289    327    524    1769
  Informal speech                      677    492    249    235    327    1980
  Formal spoken and written English    599    674    310    436    492    2511

5. FINAL REMARKS

Obviously we cannot conclude from what precedes that 'heavy' noun-phrase types do not occur as subjects. They do, and by no means infrequently. Consider for example:

The corresponding sharp, but not mathematically discontinuous, changes predicted by theory for a large but finite assembly (containing perhaps 10²³ molecules) are, from an experimental point of view, indistinguishable from those that would be expected for an 'infinite' assembly. (8a.3)

Instances of more particular interest, including especially those of which details have become available since the publication of the most recent of the aforementioned reviews, are discussed in the following text. (8a.4)

Well, I think people who are going to create anything new to develop a new idea, which is not on the present tram lines of communication, have got to do that. (5b.2)

A daffodil that Mey had plucked from the flower-banks before coming on board and had planted in the join between the two rear seats, now the sole survivor of the reception left behind them, jiggled its soundless bell. (6.4)

A faith which depends upon the private and inward operation of the Spirit, whether in the form of intellectual apprehension, emotional experience, or dreams and visions, will not be the genuine work of the Spirit of God. (8ab.2)

It is interesting to note that all these examples, with one exception, are from written texts. In the spoken texts such 'heavy' structures are extremely rare.

There is evidence in this corpus to show that the language possesses certain devices to cope with 'heavy' structures. Yngve (1960, 1961) has pointed out that among these is the use of discontinuous constituents, as in the following cases:

Several main types of homolytic arylation reactions are known, which feature interaction between aryl radicals and aromatic substrates, and which can be generally expressed by equation (1). (8a.4)

... conditions can be found under which they are not a complicating ... in the interpretation of quantitative results.

White corpuscles resembling lymphocytes and polymorphonuclear cells occur, produced by lymphoid tissue in the kidneys and elsewhere. (8a.1)

Another device is to put the subject in sentence-final position in be-sentences containing a complement or an adjunct:

Attached to this base is a series of incomplete cartilaginous boxes surrounding the brain and organs of special sense. (8a.1)

At the end of the respiratory tube is a series of velar tentacles, corresponding exactly in position to those of amphioxus, and serving to separate the mouth and oesophagus from the respiratory tube while the lamprey is feeding. (8a.1)

The following repetitive device occurs in a spoken example:

What he was asking for was that the Opposition, that Her Majesty's Opposition, which has a very important part to play in our whole constitutional structure, that Her Majesty's Opposition should be kept informed by Her Majesty's Government of the main and most crucial developments in defence. (5b.16)

These problems present some very interesting questions, but to examine these would be beyond the scope of this paper. As to our hypothesis, there can be no doubt about its validity. There is overwhelming evidence in our corpus to justify the conclusion that noun-phrase types are not randomly distributed over the English clause, but that there is a marked association between their structural make-up and their functional role. There is less evidence to support the stylistic part of our hypotheses, which probably requires the investigation of more material.

Katholieke Universiteit, Instituut Engels-Amerikaans, Bijleveldsingel 72, Nijmegen, The Netherlands

REFERENCES

YNGVE, Victor H., 1960. 'A model and an hypothesis for language structure', Proceedings of the American Philosophical Society 104, 444-466.
YNGVE, Victor H., 1961. 'The depth hypothesis', Proceedings of Symposia in Applied Mathematics, XII, Structure of Language and its Mathematical Aspects, 130-138.