Int. Libr. Rev. (1984) 16, 299-307
Efficiency of PRECIS Role Operators
M. MAHAPATRA* and S. C. BISWAS
The efficiency of the role operators of PRECIS may be studied from two viewpoints: quantitative and qualitative. The frequency with which the different role operators appear in input strings may be taken as one quantitative measure of efficiency, and the scope of this paper is confined to measuring the efficiency of role operators in that way. Most earlier research on PRECIS has focused either on general descriptions of the system or on its application to various subjects, media, languages, institutions and countries, and to regional, national and international information exchange networks. Apart from some general evaluative studies in different subject fields,1,2 no work has so far been reported that evaluates the system quantitatively through its role operators.
MATERIALS AND METHODS
Two hundred abstracts in each of three subject fields (taxation, genetic psychology and Shakespearian drama) were recorded on slips from the secondary sources Journal of Economic Literature, Psychological Abstracts and Abstracts of English Studies respectively, published from 1970 to 1980. A PRECIS input string was prepared for each of the 600 items according to the principles and methods recommended in the Manual.3 The role operators assigned to the various concepts were then counted, irrespective of their relation to and position among the other role operators in the input string. A frequency distribution of the role operators was compiled with tally marks, and the percentage represented by each frequency was calculated.
*M. Mahapatra, M-1, Teachers' Quarters, Tarabag, Burdwan-713104, West Bengal, India.
1 D. W. Austin (1974). The development of PRECIS: a theoretical and technical history. J. Documn. 30, 47-102.
2 K. G. B. Bakewell (1975). The PRECIS indexing system. Indexer 9, 160-66.
3 D. W. Austin (1974). PRECIS: a Manual of Concept Analysis and Subject Indexing. London: Council of the British National Bibliography.
0020-7837/84/030299+09 $03.00/0 © 1984 Academic Press Inc. (London) Limited

TABLE I
Frequency of appearance of major categories of role operators

Category            Frequency of appearance   Percentage   Cumulative percentage
Main line                     1630               39.00            39.00
Interposed                    1537               36.77            75.77
Differencing                   235                5.63            81.40
Connectives                    694               16.60            98.00
Theme interlinks                84                2.00           100.00
Total                         4180              100.00
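
The tallying procedure described under Materials and Methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the representation of an input string as a list of (operator, term) pairs, the two sample strings, and the names tally_operators and frequency_table are hypothetical and do not come from the study. The category totals are those of Table I. Percentages recomputed in this way agree with the printed ones to within 0.01 to 0.02, a difference that presumably reflects rounding in the original table.

```python
from collections import Counter

# Hypothetical coded input strings: one (role operator, term) pair per concept.
# These two strings are illustrative only; they are not taken from the study.
strings = [
    [("1", "taxation"), ("p", "rates"), ("2", "reform")],
    [("1", "children"), ("p", "intelligence"), ("2", "inheritance")],
]


def tally_operators(input_strings):
    """Count every operator occurrence, ignoring its position in the string."""
    return Counter(op for string in input_strings for op, _term in string)


def frequency_table(counts):
    """Return (label, frequency, percentage, cumulative percentage) rows."""
    total = sum(counts.values())
    rows, running = [], 0
    for label, freq in counts.items():
        running += freq
        rows.append((label, freq,
                     round(100 * freq / total, 2),
                     round(100 * running / total, 2)))
    return rows


# Category totals as reported in Table I.
table_1 = {
    "Main line": 1630,
    "Interposed": 1537,
    "Differencing": 235,
    "Connectives": 694,
    "Theme interlinks": 84,
}

operator_counts = tally_operators(strings)
rows = frequency_table(table_1)
for label, freq, pct, cum in rows:
    print(f"{label:<17}{freq:>6}{pct:>8.2f}{cum:>8.2f}")
```

The count deliberately ignores where an operator sits in the string, which mirrors the statement that operators were counted irrespective of their position among the other operators.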
OBSERVATIONS AND DISCUSSIONS
Major Categories of Role Operators
Altogether, 4180 concepts were specified by role operators. The main line operators were used most often, closely followed by the interposed operators (Table I). Analysis of the frequency distribution of the major categories by subject field (Table II), however, showed that the main line operators occupied the top position in taxation and genetic psychology, whereas in Shakespearian drama the top position was occupied by the interposed operators.

TABLE II
Frequency of appearance of major categories of role operators in different subjects

                      Taxation          Genetic psychology   Shakespearian drama
Category              Fr*       %       Fr        %          Fr        %
Main line             535     42.93     503     35.90        592     38.62
Interposed            304     24.37     478     34.12        755     49.25
Differencing          121      9.78      83      5.93         31      2.02
Connectives           275     22.05     315     22.48        104      6.78
Theme interlinks       11      0.87      22      1.57         51      3.33
Total                1246    100.00    1401    100.00       1533    100.00
*Fr = Frequency of appearance.

Comparative analysis of the percentages showed that the ranking of the major categories of role operators was exactly the same in taxation and genetic psychology, but somewhat different in Shakespearian drama. The two top-ranked categories (main line and interposed) together covered almost 90% of total usage in Shakespearian drama, against about 70% in taxation and genetic psychology. Connectives, however, occupied the third position in all three subjects. On average, the main line operators occupied the highest position (x̄ = 39.15 ± 2.89%) in comparison with the interposed operators (x̄ = 35.91 ± 10.23%), which confirms Austin's statement that “The main line operators form the backbone of the syntactical system of PRECIS”.1 The substantial use of connectives suggests that, even though each concept in a PRECIS index entry is placed according to the principle of context dependency, some concepts would otherwise have led to ambiguity. Hence the connectives have proved their efficiency. The low frequency of the theme interlinks, on the other hand, may be attributed to the fact that journal literature seldom deals with more than one separate theme within a single document. For the indexing of journal articles, therefore, this category of role operators has very little use. The high use-value of interposed operators in Shakespearian drama, compared with taxation and genetic psychology, points to the relative efficiency of this category in humanities subjects. It might be said that in literary subjects such as Shakespearian drama authors aim to specify the key concepts more than in the other two subjects. The lesser use of connectives in Shakespearian drama, on the other hand, would indicate that in literary subjects concepts are much more context-dependent than in scientific subjects.
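
The comparisons drawn above from Table II (category rankings, the share covered by the two leading categories, and the averages across subjects) can be rechecked with a short sketch. It rests on one assumption that the text does not state: the ± figures are taken to be population standard deviations of the three subject percentages. Under that assumption the sketch reproduces 39.15 ± 2.89 for the main line operators and gives 35.91 ± 10.24 for the interposed operators, within 0.01 of the printed 10.23. The names ranking, top_two_share and summary are illustrative.

```python
from statistics import mean, pstdev

# Frequencies from Table II: category -> counts for each subject.
SUBJECTS = ("Taxation", "Genetic psychology", "Shakespearian drama")
TABLE_2 = {
    "Main line":        (535, 503, 592),
    "Interposed":       (304, 478, 755),
    "Differencing":     (121, 83, 31),
    "Connectives":      (275, 315, 104),
    "Theme interlinks": (11, 22, 51),
}
# Percentages as printed in Table II for the two leading categories.
PRINTED_PCT = {
    "Main line":  (42.93, 35.90, 38.62),
    "Interposed": (24.37, 34.12, 49.25),
}


def ranking(subject_index):
    """Categories ordered from most to least used in one subject."""
    return sorted(TABLE_2, key=lambda c: TABLE_2[c][subject_index], reverse=True)


def top_two_share(subject_index):
    """Percentage of a subject's operators covered by its two top categories."""
    counts = sorted((v[subject_index] for v in TABLE_2.values()), reverse=True)
    return 100 * sum(counts[:2]) / sum(counts)


rankings = {s: ranking(i) for i, s in enumerate(SUBJECTS)}
shares = {s: round(top_two_share(i), 2) for i, s in enumerate(SUBJECTS)}
# Mean and population standard deviation (an assumption) across the subjects.
summary = {c: (round(mean(p), 2), round(pstdev(p), 2))
           for c, p in PRINTED_PCT.items()}

print(rankings)
print(shares)
print(summary)
```

Using the sample standard deviation instead would give larger values than those printed, which is why the population form is assumed here.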
For subjects like taxation and genetic psychology, connectives are highly efficient in establishing the context-dependency of terms. Terms that help to specify concepts at the semantic level (i.e. differencing words), however, are less important in Shakespearian drama than in taxation and genetic psychology. Thus the technique of differencing compound terms is more useful for scientific subjects. Between taxation and genetic psychology there is one difference, which concerns the use of interposed operators: in taxation, authors show less desire for specificity than in genetic psychology. This can be seen from the respective percentages of use of main line and interposed operators in the two subjects. In taxation, the use of interposed operators was roughly half that of main line operators, whereas in genetic psychology these two categories were used in almost the same proportions.
1 D. W. Austin (1974). PRECIS: a Manual ..., op. cit., p. 58.
Individual Types of Role Operators
To study the frequency distribution of individual types of role operators, the five major categories were regrouped into three: main line, interposed and “other”, the last comprising differencing operators, connectives and theme interlinks. Among the main line role operators, operators 1, 2, 3 and 6 were used significantly; the remaining three operators each had less than 1% usage (Table III). Role operator 1 was the most-used operator in the group, followed by 2, 3 and 6, in that order. Among the interposed operators, p occupied the highest position, and no other operator reached a two-digit percentage of total frequency (Table IV). Incidentally, role operator r did not appear even once in this study. In the “other” category, three role operators, viz. j, k and n, were not used at all (Table V). Of the remaining nine role operators, the first position was held by v, followed by w, i, d and h, in that order. An interesting trend could be seen in their use-values: a move from first rank to second, and from second rank to third, reduced the use-value by almost half in each case. Within the differencing operators, the direct differencing operators (both lead and non-lead) had a comparatively higher use-value than the indirect ones, whose frequency of appearance was almost nil.

TABLE III
Frequency distribution for individual main line operators

Role operator   Frequency of appearance   % from total frequency   % from group frequency
0                        15                      0.36                     0.92
1                       604                     14.45                    37.05
2                       563                     13.47                    34.54
3                       214                      5.12                    13.12
4                        41                      0.99                     2.52
5                        25                      0.60                     1.53
6                       168                      4.02                    10.32
Total                  1630                                             100.00

TABLE IV
Frequency distribution of individual interposed operators

Role operator   Frequency of appearance   % from total frequency   % from group frequency
p                       812                     19.42                    52.83
q                       178                      4.26                    11.58
r                         -                         -                        -
s                       157                      3.75                    10.22
t                        89                      2.12                     5.79
g                       301                      7.21                    19.58
Total                  1537                                             100.00

TABLE V
Frequency distribution of individual role operators other than main line and interposed operators

Role operator   Frequency of appearance   % from total frequency   % from group frequency
h                        46                      1.10                     4.54
i                       121                      2.89                    11.95
j                         -                         -                        -
k                         -                         -                        -
m                         1                      0.02                     0.09
n                         -                         -                        -
o                        18                      0.45                     1.78
d                        49                      1.17                     4.84
v                       457                     10.94                    45.11
w                       237                      5.66                    23.40
x                        19                      0.45                     1.87
y                        36                      0.86                     3.56
z                        29                      0.69                     2.86
Total                  1013                                             100.00

Main Line Operators in Different Subjects. The frequency distribution of individual types of role operators was analysed for the three subjects. Among the main line operators (Table VI), operator 1 topped the list in Shakespearian drama and operator 2 in both taxation and genetic psychology. In all three subjects, role operators 1, 2, 3 and 6 were heavily used, which means that these operators, representing key system, action, agent and target/form, have the highest probability of use in all subjects. The efficiency of these operators is relatively higher than that of the remaining three main line operators. In literary subjects such as Shakespearian drama, authors mostly centre their study upon the key system. That is why role operator 1 had a comparatively higher value there than in the other subjects. Thus literary subjects can be considered heavily key-system-oriented. Scientific subjects like taxation and genetic psychology, on the other hand, are heavily action-oriented, which is why role operator 2 was the most-used main line operator in both. Further, such an action must be initiated by some system acting as the agent and must result in some effect upon some entity. The correspondingly high values of role operator 3, denoting agent, and operator 1, denoting key system, confirmed this in both subjects.
An important feature of literary subjects is that works are often prepared for different categories of readers and in different forms of exposition. For example, Shakespearian dramas can be produced for school students as well as for adults; again, they can be recast in the form of fiction. It is therefore very likely that the indexer will come across these facets when indexing documents on literary subjects. The high usage of operator 6 in Shakespearian drama supported this argument. Since journal articles in scientific subjects seldom have any scope for treating these two facets, operator 6 was used very little in the other two subjects. In sociological subjects, authors often treat their subjects from different viewpoints, but the low frequency of operator 4 in taxation did not confirm this idea. In Shakespearian drama and genetic psychology its use-value was lower still. In literary subjects, the role operator denoting location has no importance at all.

TABLE VI
Frequency distribution of main line operators in different subjects

                 Taxation          Genetic psychology   Shakespearian drama
Role operator    Fr        %       Fr        %          Fr        %
0                  8      1.49       7      1.39          -         -
1                107     20.00     137     27.24        360     60.81
2                270     50.46     235     46.73         58      9.79
3                105     19.65      87     17.29         22      3.72
4                 23      4.29      15      2.98          3      0.52
5                 19      3.55       4      0.79          2      0.33
6                  3      0.56      18      3.58        147     24.83
Total            535    100.00     503    100.00        592    100.00
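
As a consistency check on the main line figures, the subject columns of Table VI can be added up and compared with the overall frequencies of Table III, and the leading operator in each subject can be read off mechanically. The following is a minimal sketch; the function names are illustrative, and entering the dash for operator 0 in Shakespearian drama as zero is an assumption about what the dash means.

```python
# Frequencies from Table VI; a dash in the table is entered as 0 (assumption).
OPERATORS = ("0", "1", "2", "3", "4", "5", "6")
TABLE_6 = {
    "Taxation":            (8, 107, 270, 105, 23, 19, 3),
    "Genetic psychology":  (7, 137, 235, 87, 15, 4, 18),
    "Shakespearian drama": (0, 360, 58, 22, 3, 2, 147),
}
# Overall frequencies from Table III.
TABLE_3 = dict(zip(OPERATORS, (15, 604, 563, 214, 41, 25, 168)))


def totals_by_operator(table):
    """Add the subject columns to get one overall count per operator."""
    return {op: sum(col[i] for col in table.values())
            for i, op in enumerate(OPERATORS)}


def top_operator(counts):
    """Return the operator with the largest count in one subject."""
    return max(zip(OPERATORS, counts), key=lambda pair: pair[1])[0]


def group_share(counts, operators=("1", "2", "3", "6")):
    """Percentage of a subject's main line usage taken by the given operators."""
    picked = sum(c for op, c in zip(OPERATORS, counts) if op in operators)
    return round(100 * picked / sum(counts), 2)


overall = totals_by_operator(TABLE_6)
leaders = {subject: top_operator(c) for subject, c in TABLE_6.items()}
core_share = {subject: group_share(c) for subject, c in TABLE_6.items()}

print(overall == TABLE_3)
print(leaders)
print(core_share)
```

The group_share figure shows how much of each subject's main line usage falls on operators 1, 2, 3 and 6 taken together, the four operators singled out in the discussion.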

TABLE VII
Frequency distribution of interposed operators in different subjects

                 Taxation          Genetic psychology   Shakespearian drama
Role operator    Fr        %       Fr        %          Fr        %
p                 77     25.32     175     36.62        560     74.18
q                 38     12.50      38      7.94        102     13.50
r                  -         -       -         -          -         -
s                 70     23.03      75     15.69         12      1.58
t                 22      7.24      35      7.33         32      4.24
g                 97     31.91     155     32.42         49      6.50
Total            304    100.00     478    100.00        755    100.00
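
The percentage columns of Table VII can be recomputed from the frequencies, and the operators ranked within each subject, as in the sketch below. The assignment of rows to operators p, q, s, t and g is an assumption based on the order of Table IV and on the fact that the subject frequencies add up to the Table IV totals; the function names are illustrative. The recomputed percentages differ from the printed ones by about 0.01 at most.

```python
OPERATORS = ("p", "q", "s", "t", "g")  # r is omitted: it never occurred

# Frequencies from Table VII (row order assumed to follow Table IV).
FREQ = {
    "Taxation":            (77, 38, 70, 22, 97),
    "Genetic psychology":  (175, 38, 75, 35, 155),
    "Shakespearian drama": (560, 102, 12, 32, 49),
}
# Percentages as printed in Table VII.
PRINTED = {
    "Taxation":            (25.32, 12.50, 23.03, 7.24, 31.91),
    "Genetic psychology":  (36.62, 7.94, 15.69, 7.33, 32.42),
    "Shakespearian drama": (74.18, 13.50, 1.58, 4.24, 6.50),
}


def percentages(counts):
    """Share of each operator within one subject, in per cent."""
    total = sum(counts)
    return [100 * c / total for c in counts]


def largest_gap(freq, printed):
    """Largest absolute difference between recomputed and printed values."""
    return max(abs(calc - shown)
               for subject in freq
               for calc, shown in zip(percentages(freq[subject]), printed[subject]))


def rank(counts):
    """Operators ordered from most to least used in one subject."""
    pairs = sorted(zip(OPERATORS, counts), key=lambda p: p[1], reverse=True)
    return [op for op, _count in pairs]


gap = largest_gap(FREQ, PRINTED)
ranks = {subject: rank(c) for subject, c in FREQ.items()}

print(round(gap, 3))
print(ranks)
```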
Interposed Operators in Different Subjects. Among the interposed role operators (Table VII), operator p topped the rank list in Shakespearian drama and genetic psychology, whereas in taxation it ranked second to operator g, which denotes co-ordinate concepts. Overall, p was the single most-used operator in the study. This may be because concepts denoting part or property are common to all subjects, and particularly to literary subjects like Shakespearian drama. Role operator g, denoting co-ordinate concepts, was the second most-used operator in this category. Concepts denoted by this operator actually belonged either to one of the main line operators (especially 1, 2 and 3) or to interposed operators (like p and q). This role operator was particularly useful for taxation and genetic psychology. Role operator q was another efficiently used operator in all the subjects. The high use-value of s, together with the high use-value of 3, in taxation and genetic psychology implies that most of the agentive terms, whose function was not very clear, had to be preceded by role-defining terms. Hence, for scientific subjects, the efficiency of the role definer s is emphasized.
Differencing Operators, Connectives and Theme Interlinks in Different Subjects. The frequency distribution of the “other” types of role operators (Table VIII) showed two features common to all three subjects. First, operators v and w occupied the first and second positions, respectively, in every subject. Second, the operators that went unused in all subjects, i.e. j, k and n, belonged to the differencing category. On the other hand, differencing operators such as i and d featured in all three subjects, the third position being occupied by d in Shakespearian drama and by i in taxation and genetic psychology. One more striking feature of the frequency distribution is that the role operators belonging to the differencing category were the least efficiently used in all three subjects. Of the eight operators in this group, only h, i and d had any significant use. Operators h and i, which specify non-lead and lead direct differences, respectively, to the focal concept, actually belong to the same category: conceptually, they introduce terms that directly limit the connotation of the focus, and the choice between them rests on the indexer's judgement of the users' probable approach to the index. Among the unused operators, j (denoting salient difference) was deliberately not used in this study; this operator is of value in an actual library situation. Operators k and m, which specify indirect differences, could hardly be used in these subjects; of the two, only m appeared, in a single case. Operator d, denoting “date as difference”, appeared in all three subjects. Operator o seemed to be more efficient in genetic psychology. Of the two connectives, the use of v was almost double that of w in all the subjects. This indicates that when input strings are read from top to bottom, the degree of ambiguity is generally greater than when they are read from bottom to top. It can therefore be said that the strength of the syntactical relation between two terms may differ according to their relative context: it may be weaker when two terms are read from the broader to the narrower context and stronger when the same two terms are read from the narrower to the broader context.

TABLE VIII
Frequency distribution of differencing operators, connectives and theme interlinks in different subjects

                 Taxation          Genetic psychology   Shakespearian drama
Role operator    Fr        %       Fr        %          Fr        %
h                 30      7.37      16      3.80          -         -
i                 72     17.69      47     11.19          2      1.08
j                  -         -       -         -          -         -
k                  -         -       -         -          -         -
m                  1      0.24       -         -          -         -
n                  -         -       -         -          -         -
o                  -         -      18      4.29          -         -
d                 18      4.42       2      0.47         29     15.59
v                173     42.50     215     51.19         69     37.10
w                102     25.07     100     23.81         35     18.82
x                  4      0.99       6      1.43          9      4.84
y                  5      1.23      11      2.63         20     10.75
z                  2      0.49       5      1.19         22     11.82
Total            407    100.00     420    100.00        186    100.00

CONCLUSION
The above discussion leads to the conclusion that, as expected, the main line operators formed the backbone of PRECIS indexing, though with some exceptions in literary subjects like Shakespearian drama. Literary subjects are more key-system-oriented, whereas scientific subjects are action-oriented. Interposed operators can be used in all subjects with equal efficiency; p, q and g in particular are likely to be heavily used. In scientific subjects the role definers can also be used efficiently. Among differencing operators, only h and i can be used to some extent in scientific subjects; otherwise, differencing operators in general have no significant use in indexing microdocuments. The most efficient of the connectives is the downward reading component. The connectives in general, however, are of much importance for indexing purposes. It should be emphasized that more research dealing with different disciplines is needed, so that a thorough understanding of the nature, use-value and efficiency of the different PRECIS role operators may be achieved in future.
ACKNOWLEDGEMENT
We are thankful to the Authority of the University of Burdwan for approving the project on the application and efficiency of PRECIS role operators, and to Mr P. Barua, Head, Department of Library and Information Science, University of Burdwan, for encouragement and facilities. We are also thankful to Mr Martin Nail, Indexing and Dewey Classification Section, Bibliographic Services Division, the British Library, for replying to all our enquiries on behalf of Mr Derek Austin, and for his encouragement during our investigation.