International Journal of Information Management (1990), 10 (176-177)
Research Notes

M.F. Lynch, Department of Information Studies, University of Sheffield, UK.

Introduction

The second in our Research Notes series eschews the wide-angled approach and focuses instead on the work of an individual, Professor Michael Lynch of Sheffield University. It is exactly a quarter of a century since Mike joined the Department of Information Studies and a fitting moment to pay tribute to the pioneering quality of his research in the area of chemical and textual information retrieval. Peter Willett’s trace map gives a sense of the evolution and impact of Lynch’s research efforts over the years. At a time when the value of basic research is questioned in certain quarters, it is instructive to see how one man’s ideas have not only pushed forward the frontiers of knowledge, but translated with some elegance to the marketplace.

B. Cronin, Department of Information Science, University of Strathclyde, UK

Michael F. Lynch

Professor Michael F. Lynch was awarded BSc and PhD degrees in chemistry by the National University of Ireland in 1954 and 1957, respectively, and followed this with postdoctoral research at the Swiss Federal Institute under Professor V. Prelog. He then worked for two years in industry in the UK and in 1961 he joined the staff of Chemical Abstracts Service (CAS) in Columbus, Ohio in the USA, where he eventually became the head of the Basic Research Department. At this time CAS were carrying out some of the earliest large-scale experiments on the use of computers for the production of both textual and chemical databases,1 and these two related areas have formed the basis for Lynch’s subsequent research career. In 1965, he came to the University of Sheffield to take up a post at what was then the Postgraduate School of Librarianship and Information Science, where he was awarded a Personal Chair in 1975.

On arriving in Sheffield, Lynch’s first area of research was the development of automatic methods for the production of articulated subject indices, such as are used in the Chemical Abstracts subject indices. The work started with the development of routines that could identify articulation points, normally prepositions or conjunctions, in natural language sentences: these points then acted as pivots around which the clauses of the sentence could be moved to produce the final index entries.2 The initial experiments led to the development of an operational software package that was successfully used by several commercial organizations during the 1970s.3

Shortly after the start of the subject indexing work, studies were initiated into the development of automatic methods for the indexing, storage and retrieval of chemical reactions. Particular attention was given to the development of algorithms that could compare the sets of reactant and product molecules to identify those areas which had been changed in the course of the reaction. This proved to be an extremely refractory area, and work continued on it over a period of some 12 years: a range of methods was tested, including algorithms to identify both the similarities and the differences between reacting molecules, and using both connection table and Wiswesser Line Notation records.4 This series of projects finally resulted in the development of an efficient and effective graph matching procedure for connection table records that has provided a basis for the public and in-house reaction retrieval systems that are now becoming available.5

The late 1960s saw the start of research into the selection of fragment screens for chemical substructure searching.6 The work involved a detailed analysis of the frequencies of occurrence of various types of algorithmic fragment: this statistical information then formed the input to a screen selection procedure that resulted in the selection of a set of screens that occurred approximately equifrequently in the file.7 Further studies evaluated the effectiveness of substructure search systems based upon the resulting screen sets, the statistical independence of screen assignments, and the relationship between query and structure characteristics, inter alia.8 The use of frequency analysis and algorithmic fragment generation now underlies both in-house and public chemical substructure search systems.9

For much of the 1970s, Lynch was primarily involved in the application of the screen set methodology to textual databases. In particular, techniques were developed for the identification of character substrings occurring approximately equifrequently in a range of types of text. These substrings were successfully applied to a range of tasks including the compression of bibliographic data, search codes for online catalogues, and free text searching in serial files.10

Since 1980, Lynch’s main focus of interest has been the storage and retrieval of generic structures, the partially defined molecules that occur in chemical patents.11 The project resulted in an input language and a machine-readable representation that can be used for the formal and explicit description of generic structures, algorithmic procedures for the assignment of fragments to generic structures, and a range of retrieval mechanisms to allow efficient and flexible searching of files of generic structures.12 This work has been under active development for almost a decade and operational implementations of many of the ideas that have been developed are now beginning to see the light of day. Most recently, one of the generic searching strategies has proved amenable to implementation in parallel computer hardware, and there is now increasing interest in the design and implementation of parallel algorithms for a range of problems in chemical information retrieval.13

Two recent reviews summarize the very large body of research that has been carried out by Lynch over the last 25 years,14 and it is undoubtedly the case that he is best known for this work. However, during his time at Sheffield, he has also made substantial contributions to the teaching within the department. In particular, he has been largely responsible for the very substantial part that computing has always played in the Department’s taught Masters’ programmes. Online searching, spreadsheets and word processing, etc., are commonplace features of present-day information and library courses, but computer programming and database technology have been an integral part of the Sheffield programmes for more than two decades; indeed, the machines then were far, far slower than they are now, and text manipulation required students to learn assembly level languages if reasonably efficient processing was to be achieved.

Lynch’s reputation has meant that he is much in demand as a speaker both in this country and abroad. Visits have included lecture tours to China, India, Iraq and Poland, and he participated in a series of highly influential summer schools in information work that were organized by UNESCO in Sheffield in 1976 and 1979 and in Vienna in 1983, and by the British Council in Sheffield in 1983 and 1986.

His achievements have been widely recognized. Thus, in 1977, he was awarded the prize for the best paper in the Journal of the American Society for Information Science,15 and in 1980 he received the annual Award of the Institute of Information Scientists in recognition of his services to information science. In 1989, he was awarded the Skolnik Award of the American Chemical Society, which is made annually to recognize outstanding contributions to the theory and practice of chemical information science. Despite this body of work, he is always able to find time to talk with students about their work and to chat with them on an informal basis: indeed, this friendliness is likely to be one of the main memories of the hundreds of students from this department that he has taught down the years.

P. Willett, Department of Information Studies, University of Sheffield, UK

1 DYSON, G.M. AND LYNCH, M.F. (1963). Chemical-biological activities. A computer-produced express digest. Journal of Chemical Documentation, 3, pp. 81-85. COSSUM, W.E., KRAKIWSKY, M.L. AND LYNCH, M.F. (1965). Advances in automatic chemical substructure searching techniques. Journal of Chemical Documentation, 5, pp. 33-35.

2 ARMITAGE, J.E. AND LYNCH, M.F. (1967). Articulation in the generation of subject indexes by computer. Journal of Documentation, 7, pp. 170-178.

3 LYNCH, M.F. AND PETRIE, J.H. (1973). A program suite for the production of articulated subject indexes. Computer Journal, 16, pp. 46-51.

4 ARMITAGE, J.E. AND LYNCH, M.F. (1967). Automatic detection of structural similarities among chemical compounds. Journal of the Chemical Society (C), pp. 521-528. HARRISON, J.M. AND LYNCH, M.F. (1970). Computer analysis of chemical reactions for storage and retrieval. Journal of the Chemical Society (C), pp. 2082-2087. LYNCH, M.F., NUNN, P.R. AND RADCLIFFE, J. (1978). Production of printed indexes of chemical reactions using Wiswesser Line Notations. Journal of Chemical Information and Computer Sciences, 18, pp. 94-96.

5 LYNCH, M.F. AND WILLETT, P. (1978). The automatic detection of chemical reaction sites. Journal of Chemical Information and Computer Sciences, 18, pp. 154-159.

6 ADAMSON, G.W., CREASEY, S.E., EAKINS, J.P. AND LYNCH, M.F. (1973). Analysis of structural characteristics of chemical compounds in a large computer-based file. Part 5. More detailed cyclic fragments. Journal of the Chemical Society (C), pp. 2071-2076 (and the four preceding papers in this series).

7 ADAMSON, G.W., COWELL, J., LYNCH, M.F., MCCLURE, A.H.W., TOWN, W.G. AND YAPP, A.M. (1973). Strategic considerations in the design of screening systems for substructure searches of chemical structure files. Journal of Chemical Documentation, 13, pp. 153-157.

8 LYNCH, M.F. (1975). Screening large chemical files. In: Chemical information systems (J.E. Ash and E. Hyde, eds). Chichester: Ellis Horwood.

9 ASH, J.E., CHUBB, P.A., WARD, S.E., WELFORD, S.M. AND WILLETT, P. (1985). Communication, storage and retrieval of chemical information. Chichester: Ellis Horwood.

10 CLARE, A.C., COOK, E.M. AND LYNCH, M.F. (1972). The identification of variable-length character strings in a natural language database. Computer Journal, 15, pp. 259-262. LYNCH, M.F. (1977). Variety generation - a reinterpretation of Shannon’s mathematical theory of communication and its implications for information science. Journal of the American Society for Information Science, 28, pp. 19-25. COOPER, D., EMLY, M.A., LYNCH, M.F. AND YATES, A.R. (1980). Compression of continuous prose texts using variety generation. Journal of the American Society for Information Science, 31, pp. 201-207. COOPER, D. AND LYNCH, M.F. (1984). The use of binary search trees in external distribution sorting. Information Processing and Management, 20, pp. 547-557.

11 BARNARD, J.M. (ED.) (1984). Computer handling of generic chemical structures. Aldershot: Gower.

12 DOWNS, G.M., GILLET, V.J., HOLLIDAY, J. AND LYNCH, M.F. (1989). Computer storage and retrieval of generic chemical structures in patents. Part 10. The assignment and logical bubble-up of ring screens for structurally explicit generics. Journal of Chemical Information and Computer Sciences, 29, pp. 215-224 (and the nine preceding papers in this ongoing series).

13 BRINT, A.T., GILLET, V.J., LYNCH, M.F., MANSON, G.A. AND WILSON, G.A. (1988). Chemical graph matching using transputer networks. Parallel Computing, 8, pp. 295-300.

14 LYNCH, M.F. AND WILLETT, P. (1987). Information retrieval research in the Department of Information Studies, University of Sheffield: 1965-1985. Journal of Information Science, 13, pp. 221-234. LYNCH, M.F. AND WILLETT, P. (1987). Current research into chemical and textual information retrieval at the Department of Information Studies, University of Sheffield. Information Processing and Management, 23, pp. 447-463.

15 LYNCH (1977), op. cit., Ref. 10.

© 1990 Butterworth-Heinemann Ltd