Document retrieval using a serial bit string search


DOCUMENT RETRIEVAL USING A SERIAL BIT STRING SEARCH

ALAN F. HARDING, MICHAEL F. LYNCH and PETER WILLETT*
Department of Information Studies, University of Sheffield, Western Bank, Sheffield S10 2TN, England

Abstract-An experimental best match retrieval system is described based on the serial file organisation. Documents and queries are characterised by fixed length bit strings and the time-consuming character-by-character term match is preceded by a bit string search to eliminate large numbers of documents which cannot possibly satisfy the query. Two methods, one fully automatic and one partially manual in character, are described for the generation of such bit string characterisations. Retrieval experiments with a large document test collection show that the two-level search can increase substantially the efficiency of serial searching while maintaining retrieval effectiveness, and that a single-level search based only upon the bit strings results in only a small decrease in effectiveness in some cases.

1. INTRODUCTION

The great majority of current online bibliographic retrieval systems are based on the inverted file organisation. Although this provides a rapid response to Boolean search statements, it entails large computational overheads in the extensive sorting operations involved in the generation and updating of the indexes, the volume of disc space needed for their storage, and the complex software required to access the various files. Because of these overheads, there is now a growing body of interest in the use of serial, or direct, files which do not require indexes but which, unlike conventional batch SDI services, could provide a sufficiently fast response for interactive retrieval. Two main approaches have been suggested for increasing the speed of serial searching. The first of these involves the use of special purpose hardware as reviewed by HOLLAAR[1, 2]. An alternative, software-based approach has been described by HICKEY[3] and by DUNN et al.[4] in which a fast initial search is used to eliminate all but a small percentage of the records in a file. This initial search is based on a fixed-length bit string which is associated with each of the records in the file, and which is matched against a comparable query bit string: only those few records which match the query undergo the computationally demanding character-by-character comparison for a match on the actual query terms. Implicit in this latter approach is a means of mapping the vocabulary used for indexing the items in the file into a dedicated bit string of reasonable length: we term this approach vocabulary reduction. HICKEY[3] and DUNN et al.[4] used a superimposed coding technique due to HARRISON[5, 6] in which digrams and trigrams chosen from the words in documents are mapped into the bit string.
A similar two-level search is used in a chemical context as exemplified by the serial file used for interactive searching of the 5 million chemical compounds in the Chemical Abstracts Service Registry Service[7]; in this, the bit string is used to denote the presence or absence of a limited number of substructural fragments which have been carefully selected from the well nigh infinite range of possible substructures[8]. An obvious means of reducing the variety of word types encountered in natural language data bases is the use of short character strings, and there have been many reports of information retrieval systems based on such strings[9-16]. All of these reports have assumed the use of exact match or partial match systems in which the query fragments are combined for search using the well known Boolean operations of conjunction, disjunction and union. Increasingly, however, information retrieval research has moved towards the use of best match searching in which the documents comprising a file are individually matched against a list of query terms, and ranked in decreasing order of some similarity or distance function. BURNETT et al.[17] and WILLETT[18, 19] have described best match experiments with the Cranfield 1400 document test collection involving a range of methods for vocabulary reduction, including the use of fixed length and variable length substrings, and of truncation and hash coding procedures. It was found that sets of a few hundreds of certain of these descriptor types resulted in a level of retrieval effectiveness comparable with that obtained from use of the complete word vocabulary, despite the great disparity in vocabulary size. This paper describes a novel means of selecting small sets of discriminating textual substrings for the representation of document content, and reports retrieval experiments using this, and other methods for vocabulary reduction, with a large document test collection.

*Author to whom correspondence should be addressed.

2. IDENTIFICATION OF DISCRIMINATING SUBSTRINGS

COLOMBO and RUSH[20] have suggested that a string such as -YBD-, if used in place of the chemical element name MOLYBDENUM in a serial search of a textual data base, would reduce search costs because fewer character comparisons would be needed but would be most unlikely to retrieve any other terms incorrectly. This simple idea forms the basis for the procedure suggested here which involves three main stages. Firstly, a KLIC index is produced using the index terms or title words in the document collection. Each entry of the index shows a word indexed at one of its characters other than the last, and also gives the frequency of occurrence of the word; an extract from such an index is shown in Fig. 1 with the index letter shown by arrows above and below the column. A sorted word list is also used to monitor the process.
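The first stage can be illustrated with a minimal sketch. The function names `klic_index` and `letters_by_rarity` are hypothetical, and the ordering of entries under each index letter (alphabetical on the text from the index letter onwards) is an assumption, since the text above specifies only that each word is entered at every character other than the last, together with its frequency.

```python
from collections import Counter

def klic_index(words):
    """Build a KLIC-style index from a list of word occurrences.

    Each distinct word is entered once for every character position
    except the last. An entry is (text from the index letter onwards,
    word, position of the index letter, word frequency).
    """
    freq = Counter(words)
    index = {}
    for word, count in freq.items():
        for pos in range(len(word) - 1):      # the last character is not indexed
            entry = (word[pos:], word, pos, count)
            index.setdefault(word[pos], []).append(entry)
    for entries in index.values():
        entries.sort()                        # assumed ordering: right-hand context
    return index

def letters_by_rarity(index):
    """Order index letters from least to most frequent (ties broken alphabetically)."""
    totals = {letter: sum(e[3] for e in entries) for letter, entries in index.items()}
    return sorted(totals, key=lambda letter: (totals[letter], letter))
```

Looking up a letter such as `K` in the returned dictionary gives the column of words that contain that letter away from their final position, which is the view needed when candidate substrings are compared across stems.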

In the second stage, the words containing each character are examined in turn, working from the least frequent characters of the alphabet since discriminating strings are likely to include characters of lower frequency. Words of high frequency are considered together with variants on the same stem or root. The substrings of the word or stem which include the character on which the KLIC index is currently arranged are examined: for the words ROCKET and RETROROCKETS, the substrings would include ROCK, OCKE, CKET, OCK, CKE, and KET. Certain of these occur in other stems: thus OCK is seen to occur in BLOCKAGE at the head of the list in Fig. 1, as well as in SHOCK, as may be found by looking up the fragment OCK in the KLIC index. The substring CKE is found, for this vocabulary, to be the shortest discriminant substring, and is selected to represent this group of terms. In general, for high frequency terms, the substring uniquely identifies the stem, but for low frequency terms, which are not likely to be used very frequently as query terms[21] and thus need not be delineated in such detail, the requirement for uniqueness is relaxed, so that a group of unrelated stems may be conflated: thus the substring VERS occurs in REVERSAL, REVERSED, etc. as well as in the unrelated terms VERSES and PERVERSE. The results of this stage are indicated by the substrings enclosed in boxes in the KLIC index of Fig. 1.

Fig. 1. Use of a KLIC index for the identification of shortest discriminating substrings. Boxes enclose groups of words sharing a common such substring.

The final stage involves the marking in the dictionary of words from which shortest discriminant strings have been derived so as to avoid multiple strings being erroneously selected for a single stem. The second and third stages of the procedure are then repeated until words of frequency greater than some low limit in the list are exhausted. There is clearly much latitude in the procedure, and movement up and down the scale of vocabulary size is clearly possible, depending upon the threshold frequency which is chosen. The method is dependent in part upon the particular vocabulary under consideration and, unlike our previous approaches to vocabulary reduction[17, 18], is primarily manual in operation, involving as it does subjective judgements about what words or stems should be conflated. In this respect, the approach is similar to the work involved in the development of a stemming algorithm for retrieval purposes[22], the use of which, it may be noted, also results in a reduction in vocabulary size owing to the conflation of words which share a common root or stem, and of the procedure described by MULLIN for the correction of character recognition systems[23]. Examples of discriminating substrings and of some of the words to which they were assigned in the experiments reported below are shown in Fig. 2. Once all of the discriminating text fragments have been identified, a document bit string is created by matching each of the document terms in turn against the set of word fragments, and setting that bit which corresponds to the longest matching fragment for the term.
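The creation of a document bit string from a chosen fragment set might be sketched as follows. The names `longest_fragment` and `fragment_bit_string` are hypothetical, the bit string is held as a Python integer purely for convenience, and the tie-breaking rule (the first of several equally long fragments wins) is an assumption that the text does not address.

```python
def longest_fragment(term, fragments):
    """Return the longest fragment that occurs anywhere in term, or None."""
    hits = [f for f in fragments if f in term]
    # max() keeps the first of several equally long hits (assumed tie rule)
    return max(hits, key=len) if hits else None

def fragment_bit_string(terms, fragments):
    """Set one bit per term: the bit of its longest matching fragment.

    Bit i of the returned integer corresponds to fragments[i]; terms that
    contain no fragment leave the bit string unchanged.
    """
    position = {f: i for i, f in enumerate(fragments)}
    bits = 0
    for term in terms:
        frag = longest_fragment(term, fragments)
        if frag is not None:
            bits |= 1 << position[frag]
    return bits
```

A query bit string would be produced by the same function applied to the query terms, so that document and query are characterised on the same fragment set.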
For comparison in the experiments reported below, we have also used a method for vocabulary reduction based on division hashing which performed well in earlier work[18]. In this, some convenient fixed length prefix string (four characters in our experiments) of each term, space filled if necessary, is treated as a binary integer and divided by some specified number, d, where the document and query bit strings are to be of length d bits. The division results in a remainder, r, in the range 0 to d - 1 which is used to set the (r + 1)th bit in the bit string representing the document or query. Since the experiments were carried out on an ICL 1906S computer which has a 24-bit word length, the divisors were all one less than multiples of 24.
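A minimal sketch of division hashing is given here under one loudly stated assumption: the four-character prefix is converted to an integer from its 8-bit ASCII codes, which will not reproduce the integers obtained from the character code of the ICL 1906S. The function names `division_hash_bit` and `hashed_bit_string` are hypothetical.

```python
def division_hash_bit(term, d, prefix_len=4):
    """Return the remainder r (0 <= r < d) for the space-filled prefix of term.

    Assumption: the prefix is read as a big-endian integer over 8-bit ASCII
    codes; the original machine's character encoding is not modelled.
    """
    prefix = term[:prefix_len].ljust(prefix_len)            # space fill short terms
    value = int.from_bytes(prefix.encode("ascii"), "big")   # prefix as a binary integer
    return value % d

def hashed_bit_string(terms, d):
    """Set bit r (the (r + 1)th bit) for every term; result is a d-bit integer."""
    bits = 0
    for term in terms:
        bits |= 1 << division_hash_bit(term, d)
    return bits
```

Terms sharing a four-character prefix, such as ROCKET and ROCKETS, necessarily set the same bit, while unrelated prefixes may collide on the same remainder, and the shorter the bit string the more often this will occur.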

-HANG-: CHANGE, CHANGED, CHANGES, CHANGING, EXCHANGE, EXCHANGED, UNCHANGED
-HARG-: CHARGE, CHARGED, CHARGING, DISCHARGED
-VERT-: CONVERTER, INVERTED, SEMIVERTEX, VERTEX, VERTICAL, VERTOL
-VISC-: INVISCID, NONVISCOUS, VISCID, VISCOELASTIC, VISCOELASTICITY, VISCOPLASTIC, VISCOSITY, VISCOUS

Fig. 2. Discriminating substrings and the terms to which they were assigned in the Vaswani data base.


3. EXPERIMENTAL DETAILS

The experiments used the Vaswani/National Physical Laboratory document collection which is a large test set containing 11429 documents and 93 queries for which relevance judgements are available. The version of the collection employed here had the documents automatically indexed from titles and abstracts using an extensive stopword list and a suffix stripping algorithm with similar procedures being applied to the query statements: in all, a total of 7119 distinct stem types were identified. A serial search using the bit strings involved the following four steps: (i) matching of the query bit string against each of the document bit strings in turn to identify those documents having some minimal number of bits in common with the query; (ii) matching of the list of query terms against each of the lists of document terms corresponding to the documents passing the bit string match; (iii) ranking of the documents in decreasing order of some matching function based on the common terms; (iv) application of a cut-off to retrieve some fixed number of documents. The efficiency of the bit strings in eliminating documents from the term match is described in the results below by the screenout, which is the percentage of the file eliminated by the bit string search. In exact match, or partial match, searching, the full document record would need to be retrieved and inspected only if there was an exact, or an inclusive, match between the document and query bit strings. In the case of best match searching, however, all documents must be inspected which have at least one bit set corresponding to a query bit since this implies the possibility of at least one term matching; accordingly, the screenout which may be expected is very much lower than in the case of exact or partial match retrieval.
However, a simple trade-off may be effected between search efficiency and search effectiveness by specifying a minimal number of bit correspondences for the term match to take place: in the experiments reported below, a threshold, t, of 1, 2, or 3 bits was used. When t = 1, the effectiveness will be exactly the same as that of the normal term match but, as the threshold is raised, the effectiveness of recall-oriented searches in particular may suffer owing to the need to include in the output documents having very few terms in common with the query. In an operational implementation, the thresholds used would depend upon the length of a particular query. For example, a natural language need statement might well yield twenty or more search terms and a high threshold would be required to limit the term matching to an acceptable amount. For the Vaswani collection, however, the queries are fairly short, having a mean of 6.6 terms per query, and the three chosen thresholds are accordingly low. It should be noted that although the use of t > 1 will increase the screenout, and hence decrease the amount of term matching and the overall elapsed time for the search, it is also likely to increase the time required for the first level, bit string search. This increase arises from the need to shift the computer words arising from ANDing the document and query bit strings through a register to determine the exact number of matching bits: in the case of t = 1, a machine branch may be executed as soon as a matching computer word is identified. Experiments were also carried out in which the bit strings alone were used as the basis for the calculation of the matching coefficient between a document and the query. This one-level search is a much more stringent test of the ability of the bit strings to characterise the content of the documents and queries, and is also still more efficient in operation since no term matching is required at all.
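The four search steps and the threshold t can be drawn together in a short sketch. It assumes that bit strings are held as Python integers, that each document is supplied as a hypothetical `(doc_id, bit_string, term_set)` tuple, and that the matching function is simple coordination level; counting the set bits of the ANDed strings with `bin(...).count("1")` stands in for the register shifting described above.

```python
def matching_bits(doc_bits, query_bits):
    """Number of bit positions set in both the document and the query."""
    return bin(doc_bits & query_bits).count("1")

def two_level_search(query_terms, query_bits, documents, t=1, cutoff=15):
    """Return (ranked document ids, screenout percentage).

    documents is a non-empty list of (doc_id, bit_string, term_set) tuples.
    """
    query_terms = set(query_terms)
    # (i) bit string match: keep documents with at least t bits in common
    passed = [d for d in documents if matching_bits(d[1], query_bits) >= t]
    screenout = 100.0 * (len(documents) - len(passed)) / len(documents)
    # (ii) term match on the survivors, scored by the number of common terms
    scored = [(len(query_terms & terms), doc_id) for doc_id, _, terms in passed]
    scored = [s for s in scored if s[0] > 0]
    # (iii) rank in decreasing order of the matching function
    scored.sort(key=lambda s: (-s[0], s[1]))
    # (iv) apply the cut-off
    return [doc_id for _, doc_id in scored[:cutoff]], screenout
```

Raising `t` in this sketch removes more documents before the term match, which is the trade-off between efficiency and recall-oriented effectiveness discussed in the text.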
The effectiveness of these searches was characterised by the effectiveness function, E[24], which is defined as

$$E = 1 - \frac{(1 + b^2)PR}{b^2 P + R},$$

where b is a user-defined parameter which reflects the importance attached by the user to precision (P) and to recall (R), R and P being calculated on the basis of some fixed number of retrieved documents. The figures reported below refer to the mean E value when averaged over the entire set of 93 queries using cut-offs of 15 and 65 documents, these corresponding to a precision-oriented search for which b was set to 0.5, and to a recall-oriented search for which b was set to 2.0. It should be noted that the lower the E value, the better the retrieval.
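The effectiveness function is straightforward to compute, as the following sketch shows. The function names are hypothetical, the value returned when both precision and recall are zero (E = 1, the worst case) is an assumption made to avoid division by zero, and precision is taken over the number of documents actually retrieved up to the cut-off.

```python
def e_measure(precision, recall, b):
    """E = 1 - (1 + b^2) P R / (b^2 P + R); lower values mean better retrieval."""
    if precision == 0.0 and recall == 0.0:
        return 1.0                      # assumed convention for an empty match
    return 1.0 - ((1 + b * b) * precision * recall) / (b * b * precision + recall)

def e_at_cutoff(ranked_ids, relevant_ids, cutoff, b):
    """E for the top `cutoff` documents of a ranking against a relevant set."""
    retrieved = ranked_ids[:cutoff]
    hits = sum(1 for doc_id in retrieved if doc_id in relevant_ids)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return e_measure(precision, recall, b)
```

For a search with P = 0.5 and R = 1.0, the function gives E of about 0.44 when b = 0.5 and about 0.17 when b = 2.0, showing how the precision-oriented setting penalises the same output more heavily for its low precision.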


A range of methods has been suggested for determining the correlation between document and query representatives. The experiments here used the following three types of matching function: (i) simple coordination level, i.e. the number of common terms; (ii) Dice coefficient in which the number of common terms is normalised by the sum of the lengths of the document and query term lists; (iii) inverse document frequency weighting in which a match on a term of collection frequency f results in a contribution of log(N/f) to the overall match value[21].

4. RESULTS AND DISCUSSION

The results for the two-level searches, in which the bit strings are used to limit the amount of term matching carried out, and for the one-level searches, in which the bit strings alone are used for determining the degree of similarity between a document and the query, are given in Tables 1 and 2. The figures correspond to the use of a set of 719 discriminating substrings, of hashing with a comparable divisor, and of the full set of 7119 word stems. Table 1 details the screenout obtained using thresholds of 1, 2, and 3 bits (or words in the case of the term match). It will be seen that both methods of vocabulary reduction result in a high level of screenout from the bit string search, and this is so even when the minimal threshold of t = 1 is used. The screenout for the stems is that obtainable from a best match search in which an inverted file is available. In such a case, for t = 1, the lists from the inverted file corresponding to the query terms may be ORed together to yield a list of those documents which have at least one term in common with the query and thus will have a non-zero value for any of the matching functions above; in the case of t = 2 (or 3), the screenout figures correspond to an inverted file search in which pairs (or triplets) of term lists are ANDed together and the set of resultant lists ORed together. The documents in the resulting list may then be inspected one at a time to identify the best matches[3] in much the same way as the documents passing the bit string comparison are inspected in the work reported here.

Table 1. Screenout (%) using discriminating substrings, division hashing and the full term lists, and with a threshold, t, of 1, 2 or 3 matches

                            t = 1    t = 2    t = 3
Discriminating substring    62.8     88.4     96.8

Table 2. Retrieval effectiveness in a one-level search using discriminating substrings, division hashing, and the full term lists

                            Coordination          Dice                  IDF
                            b = 0.5   b = 2.0     b = 0.5   b = 2.0     b = 0.5   b = 2.0
Discriminating substring    0.81      0.76        0.87      0.84        0.79      0.75
Division hashing            0.85      0.80        0.88      0.85        0.84      0.79
Term list                   0.80      0.74        0.88      0.83        0.80      0.74

The one-level searches were carried out without any threshold so that all documents with at least one bit in common with the query entered into the ranking. The results obtained are shown in Table 2 and it will be seen that the discriminating substrings yield a level of effectiveness little different to that provided by the full stem vocabulary, which is about ten times as large.

The performance of the division hashing is noticeably poorer. A series of experiments was carried out using bit strings based on division hashing to determine the effect of variations in the length of the bit string on the efficiency and effectiveness of searching[19]. The results of these experiments, using strings containing between 239 and 959 bits, are shown in Tables 3 and 4. It will be seen that, at first, the screenout rises rapidly with an increase in the length of the bit string but that the rate of increase then slows and the screenout tends to the screenout obtainable from an inverted file search. From the results in Table 3, it is clear that increases in the length of the bit string above some number of bits will result in only a marginal increase in screenout, a result which is in line with work reported elsewhere in the context of substructural searches of machine-readable chemical structure files[26].

The effectiveness figures in Table 4 are based on one-level searches using coordination level matching and exhibit a variation comparable to the screenout results in Table 3. As the length of the bit string is increased, the effectiveness rises, i.e. E decreases, slowly. This is again in agreement with our earlier experiments using the Cranfield test collection[17, 18] and the work reported by CRAWFORD[27] where the size of the vocabulary was varied by the elimination of terms with low discriminatory power.

In the discussion so far, it has been assumed that the reduced vocabularies will be used as the basis for a comparison of document and query bit strings using conventional computer hardware. However, developments in computer design suggest an attractive alternative means of using reduced vocabularies. Associative parallel processors (APP)[28] search data by content rather than by address, as in a conventional index-based retrieval system. Content-based access is achieved by means of a storage device in which corresponding positions in each memory location can be interrogated in parallel, and a correct match causes the address of the corresponding location to be indicated to the hardware. This provides for a very fast search, although costs and fabrication methods restrict the sizes of such processors to some thousands of locations, so that the full advantages may not be achieved unless very fast means are provided for refilling the device from large scale backing storage media. A typical design involves sequential operations on bit slices and hence the processor has a width which is the number of fields in each record, and a depth which is the number of documents which can be processed at one time. It is clear that the use of a restricted vocabulary whereby the content of a document or query is represented in a bit string of fixed and limited length provides one means by which the economic viability of such devices could be enhanced substantially. LEE and SCHUEGRAF[29] have described a device involving an APP placed between a disc drive and the central processor which permits "on-the-fly" retrieval at search rates of some megabytes per second from a document file based on bit strings similar to those described here (although in their design, the bit strings are suggested as a means for the generation of a hash code rather than as document characterisations in their own right). These developments raise the possibility of considering fast mechanisms for serial search. A processing rate of 1 Mbyte, i.e. 8 Mbit, per sec using a reduced vocabulary containing 1000 terms would correspond to a search rate of some 8000 documents per second.
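The search rate quoted above follows from simple arithmetic, sketched here with hypothetical function names and with 1 Mbyte taken as 1 000 000 bytes. The scan time for a file of the size of the Vaswani collection is an illustrative extrapolation from the figures in the text and is not a measured result.

```python
def documents_per_second(bytes_per_second, bits_per_document):
    """Whole bit strings scanned per second at a given transfer rate."""
    return bytes_per_second * 8 // bits_per_document

def scan_seconds(n_documents, rate):
    """Time to scan every bit string in a file once at `rate` documents per second."""
    return n_documents / rate

rate = documents_per_second(1_000_000, 1000)   # 8 Mbit/s over 1000-bit strings
vaswani_scan = scan_seconds(11429, rate)       # a little under 1.5 s for 11429 documents
```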
Bit strings that matched the query would trigger an access to backing storage for the retrieval of the full document record for term matching and display to a user. It is likely that the first such document would be displayed in a period comparable with the delay expected in current online retrieval systems and, subsequent to this first retrieval, thinking time would probably far outweigh the search time of the processor. Thus although the total elapsed time for the search might be greater than in current interactive retrieval systems, the delay is unlikely to seriously incommode a user. It may also be noted that, in the case of a single user, accesses to backing storage would be very much more under control in such a system since the accesses would take place in the same sequence as the file, thus reducing in large part the seek time associated with moving from track to track; similar comments would apply in a multi-user environment if searching was carried out by continuous scanning of the bit strings, rather than by initiating the scan each time a query was presented.

5. CONCLUSIONS

A two-level search procedure has been described for the searching of files of documents. The first level consists of a rapid comparison of fixed length bit strings which are generated from the query and document term lists. Only those documents having some minimal number of bit matches with the query undergo the second-level term list comparison. Retrieval experiments with the Vaswani test collection show that the procedure can considerably increase the efficiency of serial searching while maintaining the effectiveness of a full search. Experiments with a one-level search in which the bit strings alone are used as the basis for the calculation of a matching function between documents and queries, show that a novel approach to the identification of discriminating character strings provides an extremely effective means of deriving redundant representations of document content. There is a distinct trade-off between the length of the bit string used and the efficiency and effectiveness of retrieval. Although performance increases with increasing bit string length, improvements above a certain point are likely to be gained only at the expense of a large increase in length. The exact point at which the marginal increase in performance is outweighed by increased storage and search times will depend, inter alia, upon the type of reduced vocabulary employed, the size of the file which is to be searched, and computer hardware constraints. The document and query bit strings may be implemented most efficiently in novel computer architectures based on associative processor devices.

Acknowledgements-Our thanks are due to Dr. P. K. T. Vaswani for the document collection, and to British Library Research and Development Department for funding during the early part of this work.

REFERENCES

[1] L. A. HOLLAAR. Unconventional computer architectures for information retrieval. Ann. Rev. Inform. Sci. Technol. 1979, 14, 129.
[2] L. A. HOLLAAR. Text retrieval computers. Computer 1979, 12, 40.
[3] T. HICKEY. Searching linear files on-line. On-line Rev. 1977, 1, 53.
[4] R. G. DUNN, W. FISANICK and A. ZAMORA. A chemical substructure search system based on Chemical Abstracts Index Nomenclature. J. Chem. Inform. Comput. Sci. 1977, 17, 212.
[5] M. C. HARRISON. Implementation of the substring test by hashing. Commun. ACM 1971, 14, 777.
[6] A. L. THARP and K. C. TAI. The practicality of text signatures for accelerating string searching. Software Practice and Experience 1982, 12, 35.
[7] N. A. FARMER and M. P. O'HARA. A new source of substance information from Chemical Abstracts Service. Database 1980, 3, 10.
[8] A. FELDMAN and L. HODES. An efficient design for chemical structure searching. I. The screens. J. Chem. Inform. Comput. Sci. 1975, 15, 147.
[9] I. J. BARTON, S. E. CREASEY, M. F. LYNCH and M. J. SNELL. An information-theoretic approach to text searching in direct access systems. Commun. ACM 1974, 17, 345.
[10] C. E. GOBLE. A free-text retrieval system using hash codes. Comput. J. 1975, 18, 18.
[11] E. J. SCHUEGRAF and H. S. HEAPS. Query processing in a retrospective document retrieval system that uses word fragments as language elements. Inform. Proc. Management 1976, 12, 283.
[12] P. W. WILLIAMS and M. T. KHALLAGHI. Document retrieval using a substring index. Comput. J. 1977, 20, 257.
[13] T. DE HEER. Quasi comprehension of natural language simulated by means of information traces. Inform. Proc. Management 1979, 15, 89.
[14] C. S. ROBERTS. Partial-match retrieval via the method of superimposed codes. Proc. IEEE 1979, 67, 1624.
[15] V. S. ALAGAR. Algorithms for processing partial match queries using word fragments. Inform. Systems 1980, 5, 323.
[16] D. KROPP and G. WALCH. A graph structured text field index based on word fragments. Inform. Proc. Management 1981, 17, 363.
[17] J. E. BURNETT, D. COOPER, M. F. LYNCH, P. WILLETT and M. WYCHERLEY. Document retrieval experiments using indexing vocabularies of varying size. I. Variety generation symbols assigned to the fronts of indexing terms. J. Documentation 1979, 35, 197.
[18] P. WILLETT. Document retrieval experiments using indexing vocabularies of varying size. II. Hashing, truncation, digram and trigram encoding of index terms. J. Documentation 1979, 35, 296.
[19] P. WILLETT. The effect of attribute variety on retrieval performance. J. Informatics 1979, 3, 3.
[20] D. S. COLOMBO and J. E. RUSH. Use of word fragments in computer-based retrieval systems. J. Chemical Documentation 1969, 9, 47.
[21] K. SPARCK JONES. A statistical interpretation of term specificity and its application in retrieval. J. Documentation 1972, 28, 11.
[22] M. LENNON, D. S. PEIRCE, B. D. TARRY and P. WILLETT. A comparison of some conflation algorithms for information retrieval. J. Inform. Sci. 1981, 3, 177.
[23] J. K. MULLIN. Reliable indexing using unreliable recognition devices. IEEE Trans. Pattern Analysis and Machine Intelligence 1981, PAMI-3, 347.
[24] C. J. VAN RIJSBERGEN. Information Retrieval. Butterworth, London (1979).
[25] A. F. SMEATON and C. J. VAN RIJSBERGEN. The nearest neighbour problem in information retrieval. An algorithm using upperbounds. ACM SIGIR Forum 1981, 16, 83.
[26] P. WILLETT. The effect of screen set size on retrieval from chemical substructure search systems. J. Chem. Inform. Comput. Sci. 1979, 19, 253.
[27] R. W. CRAWFORD. Negative Dictionary Construction. Scientific Report ISR-22, Cornell University (1974).
[28] K. J. THURBER and L. D. WALD. Associative and parallel processors. Computing Surveys 1975, 7, 215.
[29] R. M. LEE and E. J. SCHUEGRAF. An associative file store using fragments for run-time indexing and compression. Information Retrieval Research (Edited by R. N. ODDY, S. E. ROBERTSON, C. J. VAN RIJSBERGEN and P. W. WILLIAMS), pp. 280-295. Butterworth, London (1981).
