An expert system study for evaluating technical papers: Decision-making for an IPC



Pergamon 0952-1976(95)00058-5

Engng Applic. Artif. Intell. Vol. 9, No. 1, pp. 33-41, 1996. Copyright © 1996 Elsevier Science Ltd. Printed in Great Britain. All rights reserved. 0952-1976/96 $15.00 + 0.00

Contributed Paper

An Expert System Study for Evaluating Technical Papers: Decision-making for an IPC

BORIS TAMM
Institute of Cybernetics, Estonia

(Received February 1995; in revised form September 1995)

Evaluation of scientific contributions is a typical expert task that can be formalized to only a moderate extent. Therefore, the International Programme Committees collect the opinions of different experts, leaving the final decision to one expert or a small group. The aim of this paper is to model the problem domain, which does not fit into ordinary fixed or probabilistic knowledge-base structures. In this case, the knowledge must be derived and measured by some robust structures which satisfy the expert reasoning. Methods for structuring the rule base and the data dictionary, as well as logical distances between the values of the decision factors, are discussed.

Keywords: Knowledge bases, decision structures, data dictionaries, linguistic variables, logical distances.

Correspondence should be sent to: Professor B. Tamm, Institute of Cybernetics, Tallinn Technical University, Akadeemia tee 21, EE0026 Tallinn, Estonia.

1. INTRODUCTION

Structured analysis methods are often used to describe a problem domain which is primarily related to a particular class of reasoning processes. This applies to the cases where the initial information is derived from cognitive measurements. Here, a numerical algorithm can hardly be written, or an analytical model constructed, to find the necessary decisions in the domain decision space. Those decisions can, however, be obtained as a result of some inference process based on certain initial information and reasoning rules. [1]

By structured analysis one can, first, structure the data, i.e. extract the variables from the initial information, define them and specify the values that can be assigned to those variables. Furthermore, one can find the factors which can influence different variables in different reasoning situations. Those factors, which are also variables, can in turn be assigned different values. The methods to relate the factors and variables to each other in a decision-making process give the set of relations between the structured data. As a result, a structured set of structured data, called a knowledge base, is obtained.

Second, certain rules can be found, which lead the reasoning process toward obtaining the necessary decisions. Certain computational models must be specified for processing the data, taking into consideration the relations between them, to formally reach valid decisions. Thus, a knowledge base can be designed for a particular domain, where: (a) structured data, (b) a set of relations, and (c) reasoning rules or computation models [2] are specified.

Next, a structured model of the problem domain is designed. Those models can be different and very complicated, but one must always start from a robust and transparent prototype and continue towards a more detailed representation. In the reasoning processes, where all the information is available from the beginning, this model often takes a hierarchical graph form. Usually it does not contain loops and feedbacks in an explicit manner. However, it does not avoid them in the data-flow diagrams of computation models and inference processes. By specifying the variables, their values, relations and the kind of decisions involved, a great deal has been accomplished. Now, as shown in the example, the domain graph can easily be developed.

To complete an expert system for a complex intelligent reasoning domain, it is necessary to design an inference engine. This carries out the intelligent part of the reasoning, i.e. the actual synthesis of computational models, relations and data in every particular reasoning situation. The inference engine has to be equipped with logical mechanisms to deal with data-flow diagrams, which are an essential part of a precise process model. Besides data-flow diagrams (DFD), there are state-transition diagrams (STD), entity-relationship diagrams (ERD) and other diagrams, which are all parts of different process models (essential model, environmental model, behavioural model, implementation model, etc.). However, these are not discussed in this paper.

Data-flow diagrams represent every possible data flow, including internal automatic knowledge acquisition to improve decision making. [3] This means that the system collects the information about "fortunate" and "unsuccessful" decisions taken in the past, and corrects the variable values or relations according to some rules. This gives the system its learning capability.

It is appropriate to mention here that the development of expert-system technology is related to new robust modelling methods in a similar way as H-inf and H2 control are related to new robust control methods. Professional people are faced with complex problems where the usual analytical, numerical and other deterministic methods are not helpful, and where it is necessary to measure and evaluate properties that cannot be established by ordinary metrics. In these situations use must be made of linguistic variables, fuzzy sets, semantic algorithms, and a different kind of prototyping which is also involved in robust modelling. Instead of designing a correct, optimized, analytical model of the system, it will be necessary to improve its robust model by iteration, by collecting new knowledge, and by applying new technology until it is acceptable in terms of economy and quality. The past 15 years have proved the increasing success of this approach. [4]

2. THE PROBLEM

Assume a situation of a member of an International Program Committee (IPC) of a conference, who has received a package of contributed papers to be evaluated. The purpose of the evaluation lists distributed by the chairman of the committee is to evaluate the suitability of a paper for a particular conference. This evaluation list also contains a request by the chairman to specify the degree of suitability as: good, acceptable with some improvements by the author, or unacceptable. Usually, however, the evaluation list also contains more information; in particular, the chairman requests an evaluation of novelty, clarity and format.

The terminology of evaluation varies; for instance, instead of GOOD, IMPROVABLE, or UNACCEPTABLE, Accept, Discuss, or Reject could be used. These three may in turn be distributed across a wider spectrum: Others may decide, or Outstanding, and the like. This, however, does not significantly change the decision-making task of the IPC.

Thus, the initial information is provided in the manuscripts of the conference papers by the authors, according to certain rules. The task is described in the evaluation list. Now the question arises as to whether expert-system technology can be applied to this task. The answer is that this is indeed the case, provided that the evaluator is a real expert in the conference topic. [5]

Fig. 1. Draft decision structure.

3. THE KNOWLEDGE-BASE STRUCTURE

Next, the IPC member, an expert in the field, starts to structure the decision-making procedure. He/she will realize immediately that the IPC chairman has contributed to structuring his/her task, as the final Suitability variable may have three different values, and is related to three factors: Novelty, Clarity and Format. The evaluator can start to draft the decision structure (Fig. 1).

Here one can refer to other possible attribute names that make up the Suitability, such as Relevance, Technical Quality and Originality. Regardless of the terminology, however, the ultimate goal is to capture the integral value of each paper using a set of attributes appropriate for that task.

To reach the integral value of the paper, an expert has to identify the factors that make up the Novelty, Clarity and Format. Here, experts' opinions may differ. That is one of the vagaries of the whole evaluation process, as accomplished by several independent referees. Opinions may differ because of varying structuring of the decision domain, i.e. the referees may give different weights to the different properties of the papers presented. Then, the logical distances, discussed in the second part of this paper, should be measured.

In most cases, the experts conclude that the Novelty captures two main factors: new ideas or new formal results. Clarity implies the clarity of wording and expression, as well as the Integral clarity. The Integral clarity refers to whether the paper is clearly focused on the topic. In addition, at international conferences, another important component of clarity is the Language. This must be treated separately.

Format usually means whether an article is furnished with a sufficient number of references and well-explained Illustrations. The crucial question related to each paper is whether the paper is within the Scope of the conference. If this is not specifically requested, it has to be a factor of the Format variable, as shown in the present example.

Thus, two factors of Novelty have been discussed, and three factors each for Clarity and Format. As a result, the decision structure graph can be designed as shown in Fig. 2.

Fig. 2. The graph is a data-flow diagram of the knowledge-base structure. (New formal results and New ideas feed Novelty; Clarity of wording, Integral clarity and Language feed Clarity; References, Illustrations and Scope feed Format; Novelty, Clarity and Format feed Suitability.)

The decision graph shows how different decisions or subdecisions are related to the corresponding factors. It also demonstrates that the problem domain decision space has been decomposed into four mini-expert-system decision spaces, showing how the data flow is organized at each decision-making step. However, the qualitative properties of the decisions have not yet been covered. Therefore, the designer has to assign a proper number of values to each decision and factor. As referred to earlier, three values of the final decision, Suitability, are given by the IPC Chairman, so the rest of them have to be specified.

Based on the earlier assumptions about the term Novelty, it can be concluded that values like NONE, FAIR, and GOOD would cover the spectrum of possible choices. In terms of Clarity, three steps seem to be sufficient: POOR, IMPROVABLE, and GOOD. Here, the second choice refers to cases where the referee can give some useful advice to the author to improve the paper, so as to comply with the IPC's requirements. For Format, the binary values ACCEPTABLE and UNACCEPTABLE seem appropriate, although it will be shown later that other solutions might exist.

In the same way, values must be assigned to the group of factors on the left-hand side of the data-flow diagram in Fig. 2. This is a typical task for a system expert (if necessary, in cooperation with a knowledge engineer), and the results for the example are self-explanatory. For that reason, they are simply displayed in Table 1, rather than discussed in detail. It should be added that although the proposed set of the given values will probably satisfy most experts, exceptions and some diversity of opinions in the evaluations might occur. But that proves the
robustness, and emphasizes the functions of expert systems. [6]

Table 1. Decision structures for mini-expert systems

Decision: Suitability
  Choices: [GOOD | ACCEPTABLE WITH MINOR IMPROVEMENTS | REJECTED]
  Factors and values:
    Novelty                [NONE | FAIR | GOOD]
    Clarity                [POOR | IMPROVABLE | GOOD]
    Format                 [ACCEPTABLE | UNACCEPTABLE]

Decision: Novelty
  Choices: [NONE | FAIR | GOOD]
  Factors and values:
    New formal results     [ORIGINAL | SEMIORIGINAL | NONE]
    New ideas              [ORIGINAL | SEMIORIGINAL | NONE]

Decision: Clarity
  Choices: [POOR | IMPROVABLE | GOOD]
  Factors and values:
    Clarity of expression  [POOR | AVERAGE | GOOD]
    Integral clarity       [POOR | AVERAGE | GOOD]
    Language               [POOR | AVERAGE | GOOD]

Decision: Format
  Choices: [ACCEPTABLE | UNACCEPTABLE]
  Factors and values:
    Scope                  [YES | NO]
    References             [ADEQUATE | INADEQUATE]
    Illustrations          [SUFFICIENT | TO BE IMPROVED | UNSATISFACTORY]

Having determined the decision structure and the data-flow diagram, one has to recognize that in such a decision structure, the number of combinations of values is equal to the product of the number of the values that each factor can take on. In other words, if an attempt were made to treat each of the factors in combination with the others, the total number of possible combinations in the decision space would be exactly $3^2 \times 3^3 \times 2^3 \times 3^3 = 52\,488$. So, even in this seemingly simple problem domain, the user is faced with a nontransparent variety of combinations, and efforts must be concentrated on minimizing this number.

In fact, this process has already started with the structuring of the problem, and its decomposition into four mini-systems. Now the number of combinations of values is the sum of the combinations of values for each mini-system, which is $3^2 + 3^3 + 2^3 + 3^3 = 71$, i.e. the complexity of the decision space has been reduced by a factor of about seven hundred. The figure 71 indicates that the rule base can be captured by 71 rules, which then guarantee the computation of all possible decisions at all possible values of the decision factors.

Seventy-one is still not a small number, and the number of rules can be further decreased, but this requires the introduction of some metrics to the variables and values, computation of inequalities and, in more complicated cases, also teaching the reasoning mechanism.
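The arithmetic behind the two figures can be checked with a minimal sketch. The four terms are taken exactly as they appear in the text (the list name `per_system` is only illustrative); counting values directly from the lists in Table 1 gives the same product but a slightly different sum, so the terms below should be read as the paper's stated figures rather than as something derived here.

```python
from math import prod

# Combinations per mini-expert system, term by term as stated in the text:
# 3^2, 3^3, 2^3 and 3^3.
per_system = [3**2, 3**3, 2**3, 3**3]

flat = prod(per_system)        # all factors treated jointly in one decision space
decomposed = sum(per_system)   # one small rule table per mini-expert system

print(flat)                       # product of the four terms
print(decomposed)                 # sum of the four terms
print(round(flat / decomposed))   # reduction factor, "about seven hundred"
```

The product grows multiplicatively with every added factor, while the decomposed count only grows additively, which is why the decomposition into mini-systems pays off so quickly.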

4. THE DATA DICTIONARY

Before compiling the rule base, it is necessary to edit the data dictionary, which defines the information content of every data flow on the graph in Fig. 2. The data for all the mini-expert systems must be structured in a uniform manner, because although they work independently, they all share the same data. Usually, structured English and some other constructs of structured programming are applied for this purpose. [7]

For the example used in this paper, the data dictionary is shown in Table 2, where the information items are split into three different types: data structures, data elements, and values of data elements. So, for instance, the Novelty factor is a data structure specified by two data elements: New formal results and New ideas, which are defined in terms of their values.

An equality sign in the equation means that the item on the left consists of, or is intended to be, whatever is on the right. Information between the quotation marks on the right-hand side is a content comment about the definition, but not the definition itself. The definition is inside the square brackets, and may consist of values of a data element, or data elements, when defining a data structure. The values are written in upper case, and each data element can take only one value at a time. Both the data structures defined by the data elements (on the right-hand side), and the data elements defined by their values (also on the right-hand side) are listed on the left-hand side of the equation.
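The notation described above can be mirrored in a small data structure. The following is only a sketch under assumptions: the dictionary names `DATA_ELEMENTS` and `DATA_STRUCTURES` and the helper `validate` are hypothetical, and only the Novelty entries of Table 2 are copied in to keep the example short.

```python
# Data elements: name -> (content comment, admissible values).
DATA_ELEMENTS = {
    "New formal results": ("Are there new formal results presented",
                           ("ORIGINAL", "SEMIORIGINAL", "NONE")),
    "New ideas": ("Are there new ideas presented",
                  ("ORIGINAL", "SEMIORIGINAL", "NONE")),
    "Novelty": ("Decision based on Novelty factors",
                ("NONE", "FAIR", "GOOD")),
}

# Data structures: name -> the data elements that define it.
DATA_STRUCTURES = {
    "Novelty factor": ("New formal results", "New ideas"),
}


def validate(structure, assignment):
    """Check that each data element of `structure` holds exactly one admissible value."""
    elements = DATA_STRUCTURES[structure]
    if set(assignment) != set(elements):
        raise ValueError(f"{structure} needs exactly the elements {elements}")
    for element, value in assignment.items():
        _comment, values = DATA_ELEMENTS[element]
        if value not in values:
            raise ValueError(f"{value!r} is not a value of {element}")
    return True


print(validate("Novelty factor",
               {"New formal results": "NONE", "New ideas": "ORIGINAL"}))
```

Keeping the comment and the value list side by side follows the dictionary's own convention that the quoted text explains an entry while the bracketed list defines it.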

Table 2. Data dictionary

Clarity               = "Decision based on Clarity factors" [POOR|IMPROVABLE|GOOD]
Clarity factor        = [Clarity of expression, Integral clarity, Language]
Clarity of expression = "Is the wording of sentences presented in a clear style and form" [POOR|AVERAGE|GOOD]
Formatting            = "Decision based on Format factors" [ACCEPTABLE|UNACCEPTABLE]
Formatting factor     = [References, Illustrations, Scope]
Illustrations         = "Do the illustrations contribute to the clarity of text" [SUFFICIENT|TO BE IMPROVED|UNSATISFACTORY]
Integral clarity      = "Is the paper written clearly as a whole" [POOR|AVERAGE|GOOD]
Language              = "The quality of the language used by the author" [POOR|AVERAGE|GOOD]
New formal results    = "Are there new formal results presented" [ORIGINAL|SEMIORIGINAL|NONE]
New ideas             = "Are there new ideas presented" [ORIGINAL|SEMIORIGINAL|NONE]
Novelty               = "Decision based on Novelty factors" [NONE|FAIR|GOOD]
Novelty factor        = [New formal results, New ideas]
References            = "Are the references adequately presented" [ADEQUATE|INADEQUATE]
Scope                 = "Does the paper fit with the scope of the event" [YES|NO]
Suitability           = "Decision based on Suitability factors" [GOOD|ACCEPTABLE WITH MINOR IMPROVEMENTS|REJECTED]
Suitability factor    = [Novelty, Clarity, Format]

5. THE RULE BASE

The final task is to design the rule base for each mini-expert system. There are several ways of accomplishing this; however, the most appropriate for this decision space seems to be the so-called IF-THEN type of production rules. The number of rules has been reduced to 71. As mentioned earlier, one could continue decreasing the number of those rules by applying various methods, and finally write the remaining rules. However, this requires some specific techniques and is beyond the scope of this paper. Therefore only some excerpts from the mini-expert systems' rule bases will be discussed, to show what they are like, and how they can be derived from the other parts of the knowledge base. Some examples of the mini-expert systems' rule bases are given in Table 3.

Table 3. Examples of the mini-expert-system rule bases

Novelty
1. IF   New formal results     is ORIGINAL
        New ideas              is ORIGINAL
   THEN Novelty                is GOOD
2. IF   New formal results     is NONE
        New ideas              is ORIGINAL
   THEN Novelty                is GOOD

Clarity
1. IF   Clarity of expression  is AVERAGE
        Integral clarity       is GOOD
        Language               is AVERAGE
   THEN Clarity                is IMPROVABLE
2. IF   Clarity of expression  is GOOD
        Integral clarity       is POOR
        Language               is AVERAGE
   THEN Clarity                is IMPROVABLE

Format
1. IF   Scope                  is YES
        References             is ADEQUATE
        Illustrations          is UNSATISFACTORY
   THEN Format                 is ACCEPTABLE
2. IF   Scope                  is NO
        References             is ADEQUATE
        Illustrations          is SUFFICIENT
   THEN Format                 is UNACCEPTABLE

Suitability
1. IF   Novelty                is GOOD
        Clarity                is GOOD
        Format                 is UNACCEPTABLE
   THEN Suitability            is REJECTED
2. IF   Novelty                is FAIR
        Clarity                is IMPROVABLE
        Format                 is ACCEPTABLE
   THEN Suitability            is ACCEPTABLE WITH MINOR IMPROVEMENTS

Mini-expert system Novelty

(1) Rule 1 is self-evident.
(2) Rule 2. If one of the two requirements of this rule is satisfied, then the Novelty could be considered as GOOD. Here it can be seen that if the rule had been written, not as an equality but as the satisfaction of at least one condition, the first rule could have been eliminated.

Mini-expert system Clarity

(1) The first rule indicates that the evaluation of the Integral clarity as GOOD implies that the clarity of wording as well as the language could be improved by the author, bearing in mind the recommendations of the evaluator. In that case, the evaluator would write some advice on the lower part of the evaluation sheet.
(2) The problem with the second rule is POOR Integral clarity. However, since the wording of particular expressions is correct and the language is not the factor causing rejection, it is probably worth establishing some feedback to the author, to let him/her improve the general clarity.

In both cases, the referee has to show his/her qualification as a domain expert, and consult with the author on how to improve the result.

Mini-expert system Format

(1) The first rule represents a situation where the primary requirement, Scope, has a satisfactory value and only one of the remaining two, less crucial, factors is unsatisfactory. This is obviously insufficient to reject the paper, and for this reason the expert usually accepts it with a recommendation to the author, emphasizing that the illustrations must be improved.
(2) In the second case, the submitted paper seems to be outside the scope of the conference. That means that despite any merits of the other factors, it cannot be accepted for this event.

Mini-expert system Suitability

(1) The situation of the first rule follows from the second rule of Format, where the paper was considered unacceptable because of falling outside the scope. Since the reason here is the same, regardless of the values of other factors, the paper is rejected.
(2) The second rule represents a typical "promising" case where the expert can guess that, with certain minor improvements, the manuscript could be changed into a good paper.

The comments on the rules taken as examples from different mini-expert system rule bases illustrate the expert's reasoning process, and its formalization.

6. THE WEIGHTING SCALES

One can proceed further with formalizing the decision-making process by assigning some expert scales of weight in a numerical form to the values of the data elements. This is indispensable when dealing with large numbers of rules and alternatives. Assigning numerical values introduces a concept of logical distances between the decision factors and the rules. If a data element can take three or more values, the weighting scales can express the nonlinearity of changes in the element. [7, 8]

The decision structure Suitability consists of three factors, Novelty, Clarity and Format, which are not quite equal in the context of selecting the best papers. Naturally, Novelty comprises the most important scientific or technical value of a paper, especially if it contains really new and original ideas. For this reason the IPCs sometimes use EXCELLENT, over and above GOOD, referring to outstanding papers and even allowing some of the other properties being evaluated to be discounted.
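For comparison with the weighting approach introduced here, the IF-THEN production rules excerpted in Table 3 can be written down as plain data and evaluated by a very small forward-chaining loop. This is only a sketch: the names `RULES` and `infer` are hypothetical, only the eight excerpted rules are included (not the full rule base of 71), and the Clarity value in the example is supplied directly as an assumption because the excerpt contains no rule that yields GOOD for Clarity.

```python
# Each rule: (conditions, (decision, value)); entries copied from Table 3.
RULES = [
    ({"New formal results": "ORIGINAL", "New ideas": "ORIGINAL"},
     ("Novelty", "GOOD")),
    ({"New formal results": "NONE", "New ideas": "ORIGINAL"},
     ("Novelty", "GOOD")),
    ({"Clarity of expression": "AVERAGE", "Integral clarity": "GOOD",
      "Language": "AVERAGE"}, ("Clarity", "IMPROVABLE")),
    ({"Clarity of expression": "GOOD", "Integral clarity": "POOR",
      "Language": "AVERAGE"}, ("Clarity", "IMPROVABLE")),
    ({"Scope": "YES", "References": "ADEQUATE",
      "Illustrations": "UNSATISFACTORY"}, ("Format", "ACCEPTABLE")),
    ({"Scope": "NO", "References": "ADEQUATE",
      "Illustrations": "SUFFICIENT"}, ("Format", "UNACCEPTABLE")),
    ({"Novelty": "GOOD", "Clarity": "GOOD", "Format": "UNACCEPTABLE"},
     ("Suitability", "REJECTED")),
    ({"Novelty": "FAIR", "Clarity": "IMPROVABLE", "Format": "ACCEPTABLE"},
     ("Suitability", "ACCEPTABLE WITH MINOR IMPROVEMENTS")),
]


def infer(facts):
    """Fire every rule whose conditions all hold until nothing new is derived."""
    facts = dict(facts)
    changed = True
    while changed:
        changed = False
        for conditions, (decision, value) in RULES:
            holds = all(facts.get(k) == v for k, v in conditions.items())
            if holds and decision not in facts:
                facts[decision] = value
                changed = True
    return facts


# A paper with original ideas that falls outside the scope of the event.
result = infer({
    "New formal results": "NONE", "New ideas": "ORIGINAL",
    "Clarity": "GOOD",  # assumed given; not derivable from the excerpt
    "Scope": "NO", "References": "ADEQUATE", "Illustrations": "SUFFICIENT",
})
print(result["Novelty"], result["Format"], result["Suitability"])
```

Written this way, every combination of factor values needs its own table entry, which is the growth in rule count that numerical weighting scales are meant to reduce.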

The following sections of the paper discuss the four mini-expert systems with respect to possible values of factors and/or data elements.

6.1. Format

The specification of the numeric values for a variable factor or data element is, again, an expert task. Consider some examples: first, the decision structure for Format. The choice depends on the following three factors, which can take the values:

Scope          [YES | NO]
References     [ADEQUATE | INADEQUATE]
Illustrations  [SUFFICIENT | TO BE IMPROVED | UNSATISFACTORY]

In the given decision structure, the first factor, Scope, is predominant because if it takes the negative value [NO], then regardless of the values of the other factors, the decision for Format must also take the negative value [UNACCEPTABLE]. The other two factors, References and Illustrations, are both relevant to the format of the integral quality of the paper, and could both easily be improved by the author before submitting the final copy of the paper. Such improvements are an important goal of the evaluation process, accomplished through the feedback to the authors. In papers related to engineering sciences and technology (but also in many other areas) illustrations have an important role while explaining novelties to the reader. Therefore it seems appropriate to give three different values to this data element.

As a result, the following numerical weighting scales can be assigned to the values of the Formatting decision structure:

Scope          YES = 4          NO = 0
References     ADEQUATE = 1     INADEQUATE = 0
Illustrations  SUFFICIENT = 2   TO BE IMPROVED = 1   UNSATISFACTORY = 0

i.e. Format takes the value UNACCEPTABLE IF $\sum F < 4$, and it takes the value ACCEPTABLE IF $\sum F \geq 4$. Usually IF 4 ...

Format = 0{UNACCEPTABLE}3|4{ACCEPTABLE}7

It can be seen that the examples discussed in the rule base paragraph take the values 5 and 3, respectively. Thus, Format is a veto-type factor. If the topic of the paper is outside the scope, then it cannot pretend to be a participant in this contest, no matter how excellent it is. But if this is good, then Format does not significantly influence the final Suitability decision.

6.2. Clarity

Clarity is a very important factor in both written and oral presentations. It determines the level of a conference, or the sales figures of the conference proceedings. The factors for Clarity are nearly of the same character, but their values could be identified by different experts in different ways, and with different weights.

So, for instance, for the reason mentioned, the best value [GOOD] of the Integral clarity might be evaluated higher than the same [GOOD] for the Clarity of expression because the overall clarity of each paper has a priority over the other aspects of clarity. This emphasizes the importance of clearly defining a problem, the method used, and the results. This instruction is particularly appropriate to young authors. Experienced evaluators know that sometimes papers read fluently but may seem somewhat vague; it is difficult to understand the beginning and the end (the input and the output). In this case, the expression is proper but not necessarily the whole opus.

Next, one can discuss the Language factor. Based on the same approach, one can probably distinguish a nonlinearity between the lowest value and the other two. This is merely because the upper two will not spoil the technical consistency of the paper ([AVERAGE] can be improved by the author), whereas [POOR] language is usually the reason that will make it difficult to accept a paper.

As a result of this expert reasoning in order to specify weighting scales for the Clarity factor, the following decision structure is obtained:

                        POOR   AVERAGE   GOOD
Clarity of expression    1       2        3
Integral clarity         1       3        4
Language                 0       2        3

According to this, the inferior and superior boundaries for the numerical values of Clarity can be established:

Clarity = 2{POOR}5|6{IMPROVABLE}8|9{GOOD}10

Considering the two examples of the rule-base excerpt for the Clarity mini-expert system, it can be seen that the first obtains 8, and the second 6 points. They therefore belong to the space of IMPROVABLE.
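The two weighting scales of Sections 6.1 and 6.2 can be sketched as lookup tables plus a band classifier. The function names `score` and `classify` are hypothetical. One reading is assumed and labelled in the code: the Integral clarity row is taken as 1, 3, 4, which is the reading consistent with the stated bounds 2 and 10 and with the example scores 8 and 6.

```python
FORMAT_WEIGHTS = {
    "Scope": {"YES": 4, "NO": 0},
    "References": {"ADEQUATE": 1, "INADEQUATE": 0},
    "Illustrations": {"SUFFICIENT": 2, "TO BE IMPROVED": 1, "UNSATISFACTORY": 0},
}
CLARITY_WEIGHTS = {
    "Clarity of expression": {"POOR": 1, "AVERAGE": 2, "GOOD": 3},
    # Assumed reading of the scale: 1, 3, 4 (GOOD rated above the other GOODs).
    "Integral clarity": {"POOR": 1, "AVERAGE": 3, "GOOD": 4},
    "Language": {"POOR": 0, "AVERAGE": 2, "GOOD": 3},
}

# Bands as (lowest score, highest score, decision), following the notation
# Format = 0{UNACCEPTABLE}3|4{ACCEPTABLE}7 and
# Clarity = 2{POOR}5|6{IMPROVABLE}8|9{GOOD}10.
FORMAT_BANDS = [(0, 3, "UNACCEPTABLE"), (4, 7, "ACCEPTABLE")]
CLARITY_BANDS = [(2, 5, "POOR"), (6, 8, "IMPROVABLE"), (9, 10, "GOOD")]


def score(weights, values):
    """Sum the weight of the chosen value of every factor."""
    return sum(weights[factor][values[factor]] for factor in weights)


def classify(total, bands):
    """Map a total score to the decision whose band contains it."""
    for low, high, decision in bands:
        if low <= total <= high:
            return decision
    raise ValueError(f"score {total} lies outside all bands")


# The Format and Clarity rules excerpted in Table 3.
format_1 = {"Scope": "YES", "References": "ADEQUATE", "Illustrations": "UNSATISFACTORY"}
format_2 = {"Scope": "NO", "References": "ADEQUATE", "Illustrations": "SUFFICIENT"}
clarity_1 = {"Clarity of expression": "AVERAGE", "Integral clarity": "GOOD", "Language": "AVERAGE"}
clarity_2 = {"Clarity of expression": "GOOD", "Integral clarity": "POOR", "Language": "AVERAGE"}

for values, weights, bands in [(format_1, FORMAT_WEIGHTS, FORMAT_BANDS),
                               (format_2, FORMAT_WEIGHTS, FORMAT_BANDS),
                               (clarity_1, CLARITY_WEIGHTS, CLARITY_BANDS),
                               (clarity_2, CLARITY_WEIGHTS, CLARITY_BANDS)]:
    total = score(weights, values)
    print(total, classify(total, bands))
```

Because References and Illustrations together contribute at most 3 points, a paper with Scope NO can never reach the threshold of 4, which is how the weights alone encode the veto.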

6.3. Novelty

The decision metrics for Novelty seem to be linear, provided that both factors are evaluated as equally important. Assigning the weights 1, 2 and 3 to the values of both factors, it is evident that the scores 6 and 5 give the result GOOD, score 4 gives FAIR, and score 2 gives NONE. The only uncertainty applies to the sum 3. For an expert, in any particular case, other factors will help him/her to decide whether the verdict is FAIR or NONE. For a computer program, some formal condition has to be added. So, with this uncertainty regarding the score 3, one can write

Novelty = {NONE}2|3{FAIR}4|5{GOOD}6

6.4. Suitability

The decision space for Suitability is narrowed because the Format factor has a veto influence on the decision. So it participates in the decision matrix with only one value, and correspondingly, the rule base for Suitability consists of only nine rules. One can now assign numerical weights to each value of the three factors that compose Suitability:

Novelty   NONE = 0         FAIR = 2         GOOD = 4
Clarity   POOR = 0         IMPROVABLE = 2   GOOD = 3
Format    UNACCEPT. = (0)  ACCEPT = 1

and examine the scores of all decisions comprising the decision space. Denoting the rows by N, C, and F and the columns by X, Y, and Z, the scores in Table 4 are obtained.

Table 4. Numerical scores of suitability

NX - CX - FY = 1   (1)
NX - CY - FY = 3   (2)
NX - CZ - FY = 4   (3)
NY - CX - FY = 3   (4)
NY - CY - FY = 5   (5)
NY - CZ - FY = 6   (6)
NZ - CX - FY = 5   (7)
NZ - CY - FY = 7   (8)
NZ - CZ - FY = 8   (9)

From this table, it can be seen that scores of < 5 give a negative result: REJECTED, and scores of 6 and more could be related to GOOD papers. Thus, only papers scoring 5 points can be classed as ACCEPTABLE WITH MINOR IMPROVEMENTS:

Suitability = (0){REJECTED}4|5{ACC.W.M.IMPR}5|6{GOOD}8,

where (0) stands for the veto coming from Format.

7. DISCUSSION

Is this a satisfactory result? To answer this question, one needs to return to the problem and the procedures. The aim of the task of the IPC is to contribute to the highest possible technical level of the planned event, i.e. it has to exclude low-level papers, extract the best ones, and give more opportunities to the authors whose contributions are interesting or promising, but for some reason do not yet meet the necessary standards. If the referees specify how these papers can be improved and write pertinent instructions, it very often helps. [9]

In this case, the class of improvable papers seems to be very narrow. Formally this is true, but in fact, the number of papers obtaining 5 points is not so small (according to the Novelty and Clarity decision bases), while the papers receiving GOOD and scoring 6 and even 7 could also be furnished with some remarks from the evaluators. This also results from the analysis of the evaluation factors, and seems to be acceptable.

Expert reasoning will always contain some uncertainties. An example is the third rule of the Suitability rule base, which is the only one rejected with 4 points. The reason is that the best Novelty was valued a little higher than the best Clarity. There were good reasons for doing so, but if another expert equalizes them and those papers pass the rejection threshold, one cannot blame him/her.

In the example being studied here, one of the most important prerequisites of a good paper for a given event (whether it is within the scope of the event) was related to the Format factor. With only two possible values, YES or NO, it turned out to be the veto condition. If the technical domain of a workshop or a conference is precisely specified, this is usually appropriate. However, the boundaries of the scope of some events are ambiguous, and leave more freedom for interpretation. Then the property of relevance to the scope for each paper may be structured as a separate factor, subdivided, in turn, by several attributes, such as: appropriate; a little outside the scope; marginal relevance; scope has a very small group of interest.

This problem is more complicated with larger events, where the technical program is divided into different sections. There are two visible solutions: either to establish an IPC for each section, as is the experience of the IFAC and IFIP World Congresses, or to decide upon the marginal papers at the Committee meeting after receiving all the referees' opinions. This can now also be undertaken by means of e-mail conferencing.
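The scores of Table 4 and the threshold remark made earlier in this discussion can be reproduced with a short sketch. The function name `suitability` and its optional `clarity_weights` parameter are hypothetical, and raising the best Clarity weight to 4 is only one possible reading of an expert who "equalizes" the best Novelty and the best Clarity.

```python
from itertools import product

NOVELTY = {"NONE": 0, "FAIR": 2, "GOOD": 4}
CLARITY = {"POOR": 0, "IMPROVABLE": 2, "GOOD": 3}
FORMAT_ACCEPTABLE = 1  # UNACCEPTABLE is a veto and is handled separately


def suitability(novelty, clarity, format_, clarity_weights=CLARITY):
    """Return (score, decision) for one paper; Format UNACCEPTABLE vetoes it."""
    if format_ == "UNACCEPTABLE":
        return 0, "REJECTED"
    total = NOVELTY[novelty] + clarity_weights[clarity] + FORMAT_ACCEPTABLE
    if total < 5:
        return total, "REJECTED"
    if total == 5:
        return total, "ACCEPTABLE WITH MINOR IMPROVEMENTS"
    return total, "GOOD"


# The nine rows of Table 4, in the order NX-CX, NX-CY, ..., NZ-CZ.
table4 = [suitability(n, c, "ACCEPTABLE")[0] for n, c in product(NOVELTY, CLARITY)]
print(table4)

# Row (3): no novelty but the best clarity is rejected with 4 points ...
print(suitability("NONE", "GOOD", "ACCEPTABLE"))

# ... and passes the threshold if the best Clarity is weighted like the best Novelty.
equalized = dict(CLARITY, GOOD=4)
print(suitability("NONE", "GOOD", "ACCEPTABLE", equalized))
```

A single weight change moves a whole class of papers across the rejection threshold, which illustrates why the choice of scales remains an expert decision.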
The final remark relates to the confidence level of the final decision, Suitability. Different experts probably have different confidence, or even competence, relating to different papers. This means that the final decisions made by the experts might not be equal, even if they have the same value. This is usually the case when the scope is rather wide, or when the methods and formalisms are diverse and unfamiliar to the evaluators. Therefore the IPCs sometimes want to acquire additional information about the evaluators' confidence, for instance through multiple choice:

A: I have understood the paper.
B: I have some confidence in my judgement, but not firm.
C: Do not attach much weight to my score, but I did my best.

The Confidence problem could be distributed over the whole knowledge-base structure, but it would complicate the formal structure of the evaluation process (as well as the computations) by at least an order of magnitude, without promising any reasonable effect. It would merely be like shooting sparrows with a cannon, a situation to be avoided by any expert system!

REFERENCES

1. Page-Jones M. The Practical Guide to Structured Systems Design, 2nd edn. Yourdon Press, New York (1988).
2. Yourdon E. Modern Structured Analysis. Prentice-Hall, Englewood Cliffs, NJ (1989).
3. Webb M. and Ward P. Executable dataflow diagrams: an experimental implementation. Structured Development Forum, Seattle (1986).
4. Winograd T. Extended inference modes in reasoning by computer systems. Artif. Intell. 13, 5-26 (1980).
5. Tamm B. Some fragments of expert system technology. Proc. Estonian Acad. Sci. 42, 77-89 (1993).
6. Kolodner J. L. Towards an understanding of the role of experience in the evolution from novice to expert. In Developments in Expert Systems (Edited by Coombs M. J.). Academic Press, London (1984).
7. Keller R. Expert System Technology. Development and Application. Yourdon Press, Prentice-Hall, Englewood Cliffs, NJ (1987).
8. Castillo E. and Alvarez E. Expert Systems: Uncertainty and Learning. Computational Mechanics, Elsevier Applied Science, Oxford (1991).
9. Davis R. TEIRESIAS: experiments in communication with a knowledge-based expert system. In Designing for Human-Computer Communication (Edited by Sime M. E. and Coombs M. J.). Academic Press, London (1983).

AUTHOR'S BIOGRAPHY

Boris Tamm graduated from Tallinn Technical University in 1954 as an electrical engineer. He received a Ph.D. at the Institute of Automatics and Telemechanics in Moscow in 1961, and a Doctor of Science degree in 1969. A member of the Estonian Academy of Science, the Royal Swedish Academy of Engineering Sciences and the Finnish Academy of Technology, and Doctor Honoris Causa (Budapest Technical University and Helsinki University of Technology), he has over 100 publications on information processing, software engineering, CAD and automatic control, including three monographs and one encyclopaedia. At present, he is Chief Scientist at the Institute of Cybernetics, Estonian Academy of Sciences, and a professor at Tallinn Technical University. He is a Past President and Advisor of the International Federation of Automatic Control (IFAC). His most recent publications are in the fields of adaptive control and expert-system technology.