Information science and technology at the National Science Foundation


Information Processing & Management Vol. 16, pp. 243-250. Pergamon Press Ltd., 1980. Printed in Great Britain

INFORMATION SCIENCE AND TECHNOLOGY AT THE NATIONAL SCIENCE FOUNDATION

HOWARD L. RESNIKOFF
Division of Information Science and Technology, National Science Foundation, Washington, D.C., U.S.A.

ORGANIZATION OF FIELD

Information science is concerned with the structure of information and its transfer, i.e. with the ways in which one system is informed by another. Information technology is concerned with the technical means available for use in effecting such an exchange. The support of research in information science and technology is accomplished through the following program areas.

Standards and measures

This program supports research on measures and standards for information science. It is concerned with the definition of objective and quantitative measures of attributes of information such as quantity, complexity, meaning, utility, and value, and their relation to the structural characteristics of information and to the organization, transfer, analysis, and utilization of information collections. Related matters are the development and testing of objective measures of performance for information processing systems and human beings in the various aspects of organization, identification, analysis, and retrieval of information. Research on these topics requires the availability of standard experimental test collections to control variability and enhance the comparability of experimental results. The thread of measurement runs through each of the general areas and research directions mentioned below. The measurement of information, and the connection between information and the process of measurement, are important problems of the subject.

The structure of information

This category is concerned with the study of information as idealized organization or structure from the standpoint of natural science. It is clear that particular information can be presented in a variety of ways which differ inessentially one from the other. For instance, a bold statement of fact can be conveyed in any one of the hundreds of natural languages currently in use, via print or braille or speech waveforms. Thus we conclude that some information is independent of the form in which it is presented. This idea can be generalized to the notion that information is what remains when one abstracts from the material aspects of physical reality. If one abstracts from the material aspects of printed text, then all that is left is the organization of a sequence of symbols selected from a fixed but arbitrary finite inventory. In general, the concept of information is coextensive with the notion of order or organization of material objects without regard to their physical constitution. Moreover, the scientific study of information cannot be divorced from consideration of pairs of “inventories” or systems which are capable of bearing information. In common affairs one thinks of an information data base and a user of information as exemplifying the constituents of a pair; in the physical sciences one thinks of the information elicited from an experimental situation which involves the pair consisting of an experimental apparatus and an experimental observer. Regarding the latter, the role of the observer is critical in the realm of quantum phenomena. We all know it is no different for common macroscopic information systems, yet few theories of information transfer consciously balance the role and properties of the observer (= user) with that of the data base. 
Research in the structure of information is concerned with these general problems and particularly with their specific realizations in information collections and access systems, in the relationship between form and content in language, with the connection between statistical theories of information and structural properties, and with the representational structure of information patterns in text, image, and numerical archives.

Behavioral aspects of information transfer

This category is concerned with research on the abilities and limitations of people as information processors. Keeping in mind that information transfer involves pairs of information systems, the dichotomy between inanimate and animate systems leads to three types of transfer interactions: machine-machine, machine-human, and human-human. Consideration of the first of these lies in the province of the Structure of Information program, whereas the other two raise essentially behavioral and cognitive as well as human factors questions. The general principles of cognitive processing and retrieval, including selected aspects of memory, learning, problem solving, and information pattern recognition, underlie the human ability to classify, categorize, and retrieve information whether its source be collections of technical documents or the sensory channels. Furthermore, the development of effective machine-based and interactive information systems depends upon a comprehensive understanding of the human factors involved in the various interfaces between people and the inanimate information systems with which they interact.

When information transfer involves more than one person interacting with an inanimate constituent of an information system, new problems arise. Large information systems are generally used by large numbers of people, with the consequence that they may be most effective for the hypothetical average user although they are not as effective as they could be for any particular user. This leads to the general area of information transfer involving more than one person and to the study of objectively measurable properties of human-human information transfer.

Infometric models

Information has been playing an increasing role in society, and has become interwoven with technology and the economy in a complex way. These interactions, and the pervasive effect of the rapidly changing microelectronic and telecommunications developments, call for a more comprehensive and fundamental understanding of the role of information in economic theory, and for the development of the analytical apparatus and data which will make the theory applicable to the problems of predicting as well as tracing the impact of technological, regulatory, and other changes which affect the information component of the economy.

Information technology

It is a characteristic of information science that many of its research opportunities and problems derive from advances in information technology. Projects in this category are intended to accelerate the utilization of new research results and to encourage the integration of diverse technical means into complex information processing systems.

INFORMATION SCIENCE AND TECHNOLOGY AWARD ANALYSIS

The Division of Information Science and Technology was established in March 1978 and its program in information science was announced by the end of that year. It incorporates certain research responsibilities of previous Foundation programs which were primarily concerned with science information dissemination. The research budget for those programs was so small, and the focus of effort so different, that prior award and funding patterns are not comparable with present ones.

RESEARCH DIRECTIONS AND OPPORTUNITIES

Research in information science is entering a new, more broadly based phase, which recognizes common features in problems that have arisen independently in many other fields of science and integrates diverse methods and results to study them. Moreover, recent advances in information technology have stimulated research on certain old problems, opened new research areas, and are for the first time providing the opportunity to collect and analyze large quantities of observational and experimental results which will form the foundation for the creation and validation of future theories and their application to practice.


Funding levels ($)

Information science & technology                   1980         1979
Standards and measures                        1,000,000    1,004,500
Structure of information                      1,650,000    1,028,600
Behavioral aspects of information transfer    1,300,000    1,031,500
Infometrics                                   1,125,000    1,345,000
Information technology                          125,000            -
Totals                                        5,200,000    4,409,600

Information Science Award Analysis, FY 1979

Total FY awards                                              40
Special research initiation awards for new investigators     10
Proposals received in FY (thru August 31)                   103

Percentage distribution of awards
  New starts                       52.5%
  New awards prior support         15.0%
  Continuing grant commitments      5.0%
  Supplements                       2.5%
  Major items                       0
  Research initiation awards       25.0%

Percentage distribution of funds
  New starts                       62.4%
  New awards prior support         18.3%
  Continuing grant commitments      6.1%
  Supplements                       0
  Major items                       0
  Research initiation awards       13.2%

The recently constituted Advisory Committee for the Division of Information Science and Technology is in the process of identifying core problems for research in the field in relation to the program elements. Although these efforts are not yet completed, certain research opportunities have become apparent. The areas of special opportunity described below are those where significant progress can be expected and is likely to have practical as well as theoretical consequences.

A dominant feature of archival, biological, and computer information systems is their size. The quantity of information which must be processed and stored by a system governs the structural organization which is necessary to permit the information to be accessed, analyzed, and used. Recent progress in the theory of file organization and archival access has set the stage for development of a more comprehensive theory of the optimal organization of general information access systems. Hierarchical organization is one form of system structure which is frequently found in nature and used in practice. It has been shown that this type of structure has various optimal features. The Division hopes to increase support for research in this direction.

The problem of the selective omission of information occupies a central position in information science. It can be thought of in the following terms. If human channel capacity were infinite, then whenever it were desired to locate information in an archive or memory, the entire collection could be read in real time and there would be no information access problem. But because channel capacity is finite, it will not in general be possible to process a large archive or other information collection in real time. This means that some of it must be omitted or ignored. Thus the question of what should be ignored becomes a central issue. Indexes and abstracts of books and documents provide a familiar class of examples: their purpose is to identify the essential or characteristic content of the source by selectively omitting most of it and reformulating what remains. The human sensory information systems, such as the vision modality, automatically perform a comparable process of selective omission for similar reasons. Research concerned with revealing the principles which govern the selective omission of information in the creation of indexes and abstracts and also in the processing of sensory information is of special importance. The semantic relationship of an abstract to the original text raises special questions concerning the connection between the syntax and semantics of language which have been actively studied for more than twenty years. This is an area which requires continuing attention and which can benefit from more comprehensive experiments now possible as a result of the decreased cost of computing and machine readable storage.

Related to the problem of selective omission is the question of how knowledge is represented in thought and in language. This is a topic of research which has drawn increasing attention in the cognitive sciences and artificial intelligence; it plays a central role in both basic and applied information science. The relationship between linguistic and non-linguistic (e.g. image) representations and the perception of categories plays a central role in problems of knowledge and fact retrieval. Research on these topics will be intensified. A new thrust in this area will be aimed at studying representational structures for images such as maps, engineering drawings, graphics in patents, etc. with the goal of being able to manipulate their semantic content.

A related research direction concerns the interaction between people and inanimate information systems. Problems of “impedance match”, or “human factors”, or “transparency” of the interface are amongst the more applied aspects of the larger and fundamental questions which concern pairs of interacting information systems. The ability of an information system to be responsive to an inquiry is a function not only of its structure but also of the information structure of the querying person or system, including the knowledge representation used by each of the pair, and characterizations of information system performance are necessarily relative characterizations. An important application of research results is to the problem of providing specialized information in inter-disciplinary contexts in response to queries by non-specialists by adapting the knowledge representation of the system to that of the individual. The Foundation has an opportunity to play a special role in advancing these developments by coordinating activities of the Computer Science Section of the Division of Mathematical and Computer Sciences, the Cognitive Science Program of the Division of Behavioral and Neural Sciences, and the Information Science and Technology program.

The well known Shannon measure of information quantity depends only on the frequency of occurrence of informational units but not on any higher order or other structural features. An important direction for research is to study measures which reflect higher order structures, and also to discover measures of intellectual content, simplicity, and utility.
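The limitation of the Shannon measure noted above can be illustrated with a minimal Python sketch. It is not part of the Foundation's program description: the function name `shannon_entropy` and the sample sentence are illustrative assumptions. The sketch computes the measure H = -sum(p * log2(p)) from the relative frequencies of units alone, and shows that reordering the units, which destroys any higher order structure, leaves the value unchanged.

```python
import math
from collections import Counter


def shannon_entropy(units):
    """Shannon entropy in bits per unit, computed from relative frequencies only."""
    counts = Counter(units)
    total = sum(counts.values())
    # Sort the counts so that the summation order does not depend on unit order.
    return -sum((n / total) * math.log2(n / total) for n in sorted(counts.values()))


# Hypothetical sample: the same words in a meaningful and in a scrambled order.
ordered = "the cat sat on the mat".split()
scrambled = sorted(ordered)

h_ordered = shannon_entropy(ordered)
h_scrambled = shannon_entropy(scrambled)

# Both values agree: the measure sees word frequencies, not word order.
print(h_ordered, h_scrambled)
```

A measure of the kind called for in the text would have to distinguish the two sequences, for example by taking account of the order or grouping of units as well as their frequencies.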
Along these lines, recent attempts to formulate theories of approximate reasoning which incorporate and provide measures for the necessarily imprecise modes of reasoning which are appropriate when possibility rather than probability of events is considered should be continued, and their connection with results of cognitive psychology explored.

The difficulties of measuring the performance of algorithms or assessing the adequacy of competing theories concerned with information are great. They are related to the problem of discovering appropriate measures but also depend on the availability of suitable test collections for observing the natural phenomena associated with information systems and for performing appropriate experiments.

Scientific fields which possess instrumentation which extends human powers of experimental observation progress much more rapidly than those which do not. The dependence of astronomy on telescopes, and of numerical analysis on computers, are examples.


Until the recent developments in information technology, the most powerful instrument available to the researcher concerned with linguistic aspects of information science was the dictionary. Recently progress has been made, but the opportunity and the need coexist to provide special data bases, text collections, and analytical tools which will magnify our ability to observe the salient properties of information.

Information, especially scientific and technical information, has been playing an increasingly important role in society, and has become interwoven with the economy in a complex way. These interactions, and the pervasive societal effects of the new microelectronic and telecommunications developments, call for a greater effort to understand the role of information in economic theory. It may be that functional measures of information utility and measures of the economic value of information can be related. The research opportunities in this area span a broad range from the most basic to the quite practical. A new thrust is the development of models intended to make it possible to predict as well as assess the impact of technological, regulatory, and economic change on the information component of the economy. Although there is little hope that so ambitious an objective can be achieved in the immediate future, the policy decisions concerning scientific and technical information which affect society and the economy are not deferrable; consequently this research direction has a special practical urgency.

Information science is, more than many other sciences, technology based. There are at least two reasons for this. One is that economics is forcing the replacement of traditional means for storing, transmitting, retrieving, and analyzing information by new technological means. The other is that new technology is making it possible, in principle at least, to perform valuable new types of information analyses (including direct support of intellectual activity) as the size of machinable information stores increases and the ability to execute complex real time algorithms begins to rival the brain’s performance.

Rapid advances in microelectronics, telecommunications, and mass storage capabilities are generating demand for new information technology such as display formats, data base formats, voice inputs to computerized systems, and text recognition techniques. Each of these examples depends upon the solution of specialized research problems, and each represents a significant research opportunity. The integration of the results of research with technology to develop information systems capable of operating with scientific symbols, chemical structure formulae, diagrams and other structured image data, as well as text, offers a challenge, parts of which can now be accepted. This is a high priority opportunity for the information technology program of the Division.

How to transfer research results and apply new technology to ameliorate the problems and enhance the capabilities of research libraries are questions of long standing. The Division is currently assessing partial support for the research program of a proposed non-profit library research institute whose purpose will be to perform research, development, and test activities to transfer new technology to the research library environment. The Foundation’s role would be to provide seed money in the form of matching funds for an initial period. The library research institute would constitute a focus for a critical mass of researchers and technicians, and would provide a relatively risk-free environment for evaluating the cost-effectiveness of various innovations.

The problem of preservation of archival material by technical means has now assumed severe proportions. Government and other rapidly growing paper based archives are also rapidly deteriorating. There are two fundamentally distinct types of preservation activities which can be undertaken, one of which involves preservation of the document and the other preservation of its content: each of these has a significant information technology research component. The preservation of the original physical document is based on research in paper chemistry. This method primarily provides research needs and opportunities for Foundation Divisions concerned with materials research and chemistry. The second preservation technique generally utilizes high density photographic storage methods (certain materials problems also arise in this case) or high density machine readable storage methods based upon magnetic or optical principles including holography. In these cases problems of indexing and classifying the stored material so that it can be retrieved are formidable and constitute both a challenge to and an opportunity for research on indexing and, more generally, on knowledge representation.


INFRASTRUCTURES

Federal and private sector support for research

The Foundation’s program is the sole source of non-mission oriented federal support for research in information science. There are no identifiable continuing basic research programs in the industrial sector. Those organizations which maintain research capabilities, such as the Institute for Scientific Information, the Chemical Abstracts Service, SRI, Battelle, etc., seek support from the Foundation through this program. There is a considerable industrial investment in the development and application of new information technology for areas with high market potential such as entertainment. Thus, the video-disc is being developed for entertainment purposes but not for information or library applications.

The interdisciplinary nature of information science produces certain interactions with other disciplines. Information science relies heavily upon research conducted in these disciplines as the basis for understanding information problems: its analysis, storage, retrieval, and use. Some of the most interesting information systems, for example, are biological ones and there are important information science research problems which stem from them. But this should not be interpreted to mean that this Division’s interests encompass the psychological and neural sciences. There are important clues for the design of more efficient information processing systems to be found in the sensory coding and cognitive processing of information which takes place in biological systems. We look to the research supported by the Foundation’s Division of Behavioral and Neural Sciences (BNS) for the fundamental understanding in these areas. Research on the structure of language as it is conveyed by human speech is supported by the Linguistics Program, BNS. This is directly related to information science research on the structure of textual materials.

The core research on computer technology and intelligent systems, so important to information science and technology, as well as pattern recognition and representational control of coordinated movement, e.g. production systems, is supported by the Computer Science Section of MCS and the Division of Engineering in EAS. Measures of information utility involve research in decision-making under conditions of uncertainty; some fundamental work is supported in the Applied Mathematics and Statistics Program and the Computer Science Section of MCS. Finally, our interest in infometric models of scientific and technological information is closely related to the more general economic models supported by the Economics Program in the Directorate for Biological, Behavioral and Social Sciences (BBS).

SPECIAL CONSIDERATIONS

The Division has begun a program of Special Research Initiation Awards for New Investigators in Information Science to increase the rate of research innovation in the field. Several special working groups, in collaboration with the Division’s Advisory Committee, are attempting to determine those research directions which address the most pregnant problems and afford the best opportunities for Foundation support. This activity will be intensified during the coming years.

APPENDIX

RESEARCH IN INFORMATION SCIENCE

The National Science Foundation supports basic and applied research in information science and on information problems. In the selection of projects to be supported, preference is given to research which is fundamental and general and to applied research which is concerned with scientific and technical information rather than, for example, with business information or mass communication. The development of hardware is beyond the scope of this program, as also are projects to develop, implement, or evaluate information systems except for the purpose of generalization beyond the particular information systems involved.

PROGRAM GOALS

The principal goals of the Foundation’s Information Science and Technology Program are:

• To increase understanding of the properties and structure of information and information transfer.
• To contribute to the store of scientific and technical knowledge which can be applied in the design of information systems.

These goals are addressed by four research categories.

I. CATEGORIES OF RESEARCH SUPPORTED

Research proposals in four categories are considered by the Division of Information Science and Technology:

• Standards and Measures.
• Structure of Information.
• Behavioral Aspects of Information Transfer.
• Infometrics.

A proposal may address problems which involve more than one of these categories. Proposals which address problems outside these categories but are consistent with the objectives and priorities of this announcement will be considered.

STANDARDS AND MEASURES

Progress in any field of science depends on the existence of recognized and theoretically founded measures of fundamental quantities and of standards for assessing the predictions of theory and comparing the results of experiments. This category is devoted to proposals for their further development for information science. Particular research problems of interest are, for example:

• Formal characterization of the essential properties of information systems.
• Interdisciplinary application of formalized concepts and principles of information structure and process. Particularly encouraged is work that applies to both living and machine-based systems.
• Definition of objective and quantitative measures of attributes of information such as quantity, complexity, meaning, utility, and value, but not those aspects which concern computational complexity or measures of capacity of electrical communication systems. Relation of such measures to the structural characteristics of information and to the organization, transfer, interpretation, and utilization of information collections.
• Development and testing of objective measures of performance of algorithms and of human beings in the various aspects of organization, identification, and retrieval of information.
• Definition and testing of standard experimental environments and test collections to control variability and enhance the comparability of experimental results.
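As one illustration of an objective performance measure evaluated against a standard test collection, the Python sketch below computes precision and recall for a single query. These two measures are not named in the announcement; they are offered only as a familiar example from retrieval evaluation. The function name, the query label, the document identifiers, and the relevance judgments are all hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = fraction of retrieved documents that are relevant
    recall    = fraction of relevant documents that were retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical test collection: relevance judgments fixed in advance per query.
judgments = {"q1": {"d1", "d2", "d3", "d4"}}

# Hypothetical output of a retrieval algorithm under test.
run = {"q1": ["d1", "d2", "d9"]}

p, r = precision_recall(run["q1"], judgments["q1"])
print(f"precision={p:.3f} recall={r:.3f}")
```

Because the judgments are fixed independently of any one algorithm, two algorithms run against the same collection yield directly comparable figures, which is the role the announcement assigns to standard test collections.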

STRUCTURE OF INFORMATION

This category is concerned with the study of information as idealized organization or structure from the standpoint of natural science. Research topics include:

• The structural properties of information collections and of access systems.
• Connections between form and content (‘syntax’ and ‘semantics’) and the extent to which the former determines or can substitute for the latter.
• Relationships between the structure of language and information.
• Statistical theories of information, their limitations, and their relationship to other scientific theories.
• How patterns of information in text, image, and numerical archives or files can be algorithmically identified and automatically recognized.

BEHAVIORAL ASPECTS OF INFORMATION TRANSFER

This category is concerned with research on the abilities and limitations of people as information processors. The main directions for study include:

• Investigations of human information processing including those aspects of learning, memory, problem solving and pattern recognition that are relevant to information processing principles.
• Investigations of the interface between human beings as information processors and the inanimate information systems and sources with which they interact, including but not limited to text, computer terminals, and video or microform displays. The human factors of information processing capacity are of interest especially in connection with measures and methods of effective performance.
• Investigations of those aspects of information representation which admit some generalization and abstraction from specific biological mechanisms. Problem areas include: organization of information by the sensory system; nature of categorical representations; information filtering and rejection of non-essential content.

More specific problems include information interactions, such as representation and retrieval, with large data bases; information understanding as a blend of artificial intelligence and cognitive psychology; and the analysis of decisions that pertain to information processing strategies.

INFOMETRICS

• Formal characterization of the role of scientific and technical information as an ‘input’ to production and as a product. Application of appropriate notions of public goods, externalities, and economies of scale.
• Comparative aspects of the informational versus non-informational parts of the workforce (labor) and equipment (capital), including substitutability or complementarity of information and other goods, and the resulting flexibility and adaptation of the information-rich workforce and equipment.
• Analyses of the use of the same information by firms, industries, and nations.
• Application of information science concepts and results to the analysis of economic processes and of market structures and mechanisms, as well as informational aspects of decision systems in a rapidly changing environment.

Infometric Models

Government decisions, many of them nondeferrable, which concern scientific and technical information and affect the economic vitality of the Nation are continually made. In view of the importance of these matters on both practical and theoretical grounds, the Foundation will support research contributing to the long-range goal of formulating quantitative models of the ‘information economy’. Such models would represent the creation and flow of scientific and technical information as well as those aspects of economic structure and organization which impinge on incentives that particularly affect this creation and flow. The objective of such modeling would be to describe and explain known phenomena and to provide the means to identify and estimate the magnitude of future consequences of policy alternatives. Research problems, which may overlap to some extent, include:

• Modeling, at various levels of aggregation, of information flow in the economy, including complete and partial (sectoral) models. Development of relevant measures and empirical techniques.
• Development and validation of models of the existing information system and of its principal constituents.
• Preparation and collection of input data (from existing sources and by special surveys as necessary) for running models and for test of models to simulate the response of the information economy to projected or proposed changes in the conditions under which it operates.

II. SPECIAL RESEARCH INITIATION AWARDS FOR NEW INVESTIGATORS IN INFORMATION SCIENCE